PATENT APPLICATION
ATTY REF NO. PP20663.002

IMMUNOGENIC COMPOSITIONS FOR STREPTOCOCCUS PYOGENES

This application incorporates by reference in its entirety U.S. provisional patent application
No. 60/491,822, filed on July 31, 2003.

TECHNICAL FIELD 

This invention is in the fields of immunology and vaccinology. In particular, it relates to
antigens derived from Streptococcus pyogenes and their use in immunisation. All documents cited
herein are incorporated by reference in their entirety.

BACKGROUND ART 

Group A streptococcus ("GAS", S.pyogenes) is a frequent human pathogen, estimated to
be present in between 5-15% of normal individuals without signs of disease. When host defences
are compromised, or when the organism is able to exert its virulence, or when it is introduced to
vulnerable tissues or hosts, however, an acute infection occurs. Related diseases include
puerperal fever, scarlet fever, erysipelas, pharyngitis, impetigo, necrotising fasciitis, myositis and
streptococcal toxic shock syndrome.

Although S.pyogenes may be treated using antibiotics, a prophylactic vaccine to prevent
the onset of disease is desired. Efforts to develop such a vaccine have been ongoing for many
decades. While various GAS vaccine approaches have been suggested and some approaches are
currently in clinical trials, to date, there are no GAS vaccines available to the public.

It is an object of the invention to provide further and improved compositions for providing
immunity against GAS disease and/or infection. The compositions are based on a combination of two
or more (eg. three or more) GAS antigens.

DISCLOSURE OF THE INVENTION 

Applicants have discovered a group of thirty GAS antigens that are particularly suitable for
immunisation purposes, particularly when used in combinations. In addition, Applicants have
identified a GAS antigen (GAS 40) which is particularly immunogenic when used either alone or in
combinations with additional GAS antigens.

The invention therefore provides an immunogenic composition comprising GAS 40, a
fragment thereof or a polypeptide having sequence identity thereto. The invention further includes an
immunogenic composition comprising a combination of GAS antigens, said combination consisting
of two to ten GAS antigens, wherein said combination includes GAS 40 or a fragment thereof or a
polypeptide having sequence identity thereto. Preferably, the combination consists of three, four,
five, six, or seven GAS antigens. Still more preferably, the combination consists of three, four, or five
GAS antigens.

The invention also provides an immunogenic composition comprising a combination of GAS
antigens, said combination consisting of two to thirty-one GAS antigens of a first antigen group, said
first antigen group consisting of: GAS 117, GAS 130, GAS 277, GAS 236, GAS 40, GAS 389, GAS
504, GAS 509, GAS 366, GAS 159, GAS 217, GAS 309, GAS 372, GAS 039, GAS 042, GAS 058,
GAS 290, GAS 511, GAS 533, GAS 527, GAS 294, GAS 253, GAS 529, GAS 045, GAS 095, GAS
193, GAS 137, GAS 084, GAS 384, GAS 202, and GAS 057. These antigens are referred to herein as
the 'first antigen group'. Preferably, the combination of GAS antigens consists of three, four, five,
six, seven, eight, nine, or ten GAS antigens selected from the first antigen group. Preferably, the
combination of GAS antigens consists of three, four, or five GAS antigens selected from the first
antigen group.

GAS 39, GAS 40, GAS 57, GAS 117, GAS 202, GAS 294, GAS 527, GAS 533, and GAS
511 are particularly preferred GAS antigens. Preferably, the combination of GAS antigens includes
either or both of GAS 40 and GAS 117. Preferably, the combination includes GAS 40.

Representative examples of some of these antigen combinations are discussed below. 

The combination of GAS antigens may consist of three GAS antigens selected from the first
antigen group. Accordingly, in one embodiment, the combination of GAS antigens consists of GAS
40, GAS 117 and a third GAS antigen selected from the first antigen group. Preferred combinations
include GAS 40, GAS 117 and a third GAS antigen selected from the group consisting of GAS 39,
GAS 57, GAS 202, GAS 294, GAS 527, GAS 533, and GAS 511.

In another embodiment, the combination of GAS antigens consists of GAS 40 and two
additional GAS antigens selected from the first antigen group. Preferred combinations include GAS
40 and two GAS antigens selected from the group consisting of GAS 39, GAS 57, GAS 117, GAS
202, GAS 294, GAS 527, GAS 533, and GAS 511. In another embodiment, the combination of GAS
antigens consists of GAS 117 and two additional GAS antigens selected from the first antigen group.
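
As an illustrative aside, the preferred three-antigen combinations described above can be enumerated mechanically. The following Python sketch is not part of the disclosed subject matter; the names PREFERRED and three_antigen_combinations are hypothetical and are used only for this illustration. It lists every three-antigen set that contains the required antigens and draws the remaining members from the particularly preferred antigens named in the text.

```python
from itertools import combinations

# Particularly preferred antigens named in the text (GAS 40 and GAS 117 included).
PREFERRED = ["GAS 39", "GAS 40", "GAS 57", "GAS 117", "GAS 202",
             "GAS 294", "GAS 527", "GAS 533", "GAS 511"]


def three_antigen_combinations(required):
    """Return all three-antigen sets containing `required`, completed from PREFERRED."""
    required = tuple(required)
    unknown = [name for name in required if name not in PREFERRED]
    if unknown:
        raise ValueError(f"not a preferred antigen: {unknown}")
    if len(set(required)) != len(required) or len(required) > 3:
        raise ValueError("required must hold at most three distinct antigens")
    # Candidates for the remaining slots are the preferred antigens not already required.
    partners = [name for name in PREFERRED if name not in required]
    return [required + extra for extra in combinations(partners, 3 - len(required))]


if __name__ == "__main__":
    for combo in three_antigen_combinations(["GAS 40", "GAS 117"]):
        print(", ".join(combo))
```

Under these assumptions, requiring both GAS 40 and GAS 117 leaves seven candidates for the third antigen, whereas requiring only GAS 40 leaves eight candidates for the two remaining positions.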

The combination of GAS antigens may consist of four GAS antigens selected from the first
antigen group. In one embodiment, the combination of GAS antigens consists of GAS 40, GAS 117
and two additional GAS antigens selected from the first antigen group. Preferred combinations
include GAS 40, GAS 117, and two GAS antigens selected from the group consisting of GAS 39,
GAS 57, GAS 202, GAS 294, GAS 527, GAS 533, and GAS 511.

In another embodiment, the combination of GAS antigens consists of GAS 40 and three
additional GAS antigens selected from the first antigen group. Preferred combinations include GAS
40 and three additional GAS antigens selected from the group consisting of GAS 39, GAS 57, GAS
117, GAS 202, GAS 294, GAS 527, GAS 533, and GAS 511. In one embodiment, the combination of
GAS antigens consists of GAS 117 and three additional antigens selected from the first antigen group.

The combination of GAS antigens may consist of five GAS antigens selected from the first
antigen group. In one embodiment, the combination of GAS antigens consists of GAS 40, GAS 117
and three additional GAS antigens selected from the first antigen group. Preferred combinations
include GAS 40, GAS 117 and three additional GAS antigens selected from the group consisting of
GAS 39, GAS 57, GAS 202, GAS 294, GAS 527, GAS 533, and GAS 511.

In another embodiment, the combination of GAS antigens consists of GAS 40 and four
additional GAS antigens selected from the first antigen group. Preferred combinations include GAS
40 and four additional GAS antigens selected from the group consisting of GAS 39, GAS 57, GAS
117, GAS 202, GAS 294, GAS 527, GAS 533, and GAS 511. In one embodiment, the combination of
GAS antigens consists of GAS 117 and four additional GAS antigens selected from the first antigen
group.

The combination of GAS antigens may consist of eight GAS antigens selected from the first
antigen group. In one embodiment, the combination of GAS antigens consists of GAS 40, GAS 117
and six additional GAS antigens selected from the first antigen group. In one embodiment, the
combination of GAS antigens consists of GAS 40 and seven additional GAS antigens selected from
the first antigen group. In one embodiment, the combination of GAS antigens consists of GAS 117
and seven additional GAS antigens selected from the first antigen group.

The combination of GAS antigens may consist of ten GAS antigens selected from the first
antigen group. In one embodiment, the combination of GAS antigens consists of GAS 40, GAS 117
and eight additional GAS antigens selected from the first antigen group. In one embodiment, the
combination of GAS antigens consists of GAS 40 and nine additional GAS antigens selected from the
first antigen group. In one embodiment, the combination of GAS antigens consists of GAS 117 and
nine additional GAS antigens selected from the first antigen group.

Each of the GAS antigens of the first antigen group is described in more detail below.
Genomic sequences of at least three GAS strains are publicly available. The genomic sequence of an
M1 GAS strain is reported at Ref. 1. The genomic sequence of an M3 GAS strain is reported at Ref.
2. The genomic sequence of an M18 GAS strain is reported at Ref. 3. Preferably, the GAS antigens
of the invention comprise a polynucleotide or amino acid sequence of an M1, M3 or M18 GAS strain.
More preferably, the GAS antigens of the invention comprise a polynucleotide or amino acid
sequence of an M1 strain.

(1) GAS 117 

GAS 117 corresponds to M1 GenBank accession numbers GI: 13621679 and GI: 15674571, to M3
GenBank accession number GI: 21909852, to M18 GenBank accession number GI: 19745578, and is
also referred to as 'Spy0448' (M1), 'SpyNBjm^ (M3), and 'SpyM18JW9r (M18). Examples of
amino acid and polynucleotide sequences of GAS 117 of an M1 strain are set forth below:

SEQ ID NO: 1 

MTLKKHYYLLSLLALVTVGAA F I LDGYQND 

35 LGRHYSS YYYYNLRTVMGLSSEQDI BKHYBELKNKLHDMYNHY 

SEQ ID NO: 2 

ATGACACTAAAAAAACACTATTATCTTCTCAGCCTGCTAGCTCTTGTAACGGTTGGTGCTC
CA^GCCAGAGTGTCAGTGCACAAGTTTATAOCAAT^
AC^CCTGCAATATAGTAAAGACAACGCACAACTTCAATT^ 

CTAGGGAGACACTACTCTAGCTATTATTACTACAACCTAAGAACCGTTATGCGACTATCAAGT^ 
ACATTGAAAAACACTATGAAGAGCTTMGAACAAGTTACATGATATC^ 

Preferred GAS 117 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (eg. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 1; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 1, wherein n is 7 or more (eg. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100 or more). These GAS 117 proteins include variants (eg. allelic
variants, homologs, orthologs, paralogs, mutants, etc) of SEQ ID NO: 1. Preferred fragments of (b)
comprise an epitope from SEQ ID NO: 1. Other preferred fragments lack one or more amino acids
(e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID NO: 1. For
example, in one embodiment, the underlined amino acid sequence at the N-terminus of SEQ ID NO: 1
is removed. Other fragments omit one or more domains of the protein (eg. omission of a signal
peptide, of a cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).
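
The criteria recited above, a percent-identity threshold, a minimum fragment length, and terminal truncation, can be illustrated computationally. The Python sketch below is a simplified illustration under stated assumptions: the text does not specify how identity is calculated, so the sketch assumes the two sequences have already been aligned (gaps written as '-') and simply counts identical alignment columns. The function names and the toy reference sequence are hypothetical; the toy sequence is not SEQ ID NO: 1.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns holding the same residue in both sequences.

    Assumption: inputs are pre-aligned, equal-length strings using '-' for gaps.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns that are gaps in both sequences.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b) if (a, b) != ("-", "-")]
    if not columns:
        raise ValueError("no alignment columns to compare")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def is_fragment(candidate: str, reference: str, n: int = 7) -> bool:
    """True if candidate is a run of at least n consecutive residues of reference."""
    return len(candidate) >= n and candidate in reference


def truncate(reference: str, n_term: int = 0, c_term: int = 0) -> str:
    """Remove n_term residues from the N-terminus and c_term from the C-terminus."""
    if n_term < 0 or c_term < 0 or n_term + c_term >= len(reference):
        raise ValueError("truncation must leave at least one residue")
    return reference[n_term:len(reference) - c_term]


if __name__ == "__main__":
    toy_reference = "MKTAYIAKQRQISFVKSHFSRQ"  # hypothetical, for illustration only
    print(percent_identity("ACDEFGHIKL", "ACDEFGHIKV"))
    print(is_fragment("AYIAKQR", toy_reference))
    print(truncate(toy_reference, n_term=3, c_term=2))
```

In practice the identity value depends on the alignment method and its parameters, which this sketch deliberately leaves outside its scope.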

(2) GAS 130

GAS 130 corresponds to M1 GenBank accession numbers GI: 13621794 and GI: 15674677, to M3
GenBank accession number GI: 21909954, to M18 GenBank accession number GI: 19745704, and is
also referred to as 'Spy0591' (M1), 'SpyM3_0418' (M3), and 'SpyMlfc JJ660* (M18). GAS 130 has
potentially been identified as a putative protease. Examples of amino acid and polynucleotide
sequences of GAS 130 of an M1 strain are set forth below:

SEQ ID NO: 3 

25 MS HMKKR PEVLS PAGTLB KLKVAI DYGADAVFVGGQAYGLRS RAGN FSMEELQBGI DYAHARGAKVYVAA 
MMVTHEGNEI GAGEWFRQLROMGLDAVI VSDPALI VICSTEAPGLB IHLSTQAS STNYBTFE FWKAMGLT 
RWLAREVNMAELAEI^KRTDVEI RAFVHGAMCI S YSGRCVLSNKMSHRDANRGGCSQSCRWKYDLYDMP 
PGGBRRS LKGEI PED YSMSS VDMCMI DHI PDLI ENGVDSLKI EGRMKS I HYVSTVTNCYKAAVGAYMBS P 
EAFYAI KEBLI DBLWKVAQRELATGFYYGI PTBNBQLFGARRKI PQYKFVGE WAFDSASMTAT I RQRNV 

30 IMEGDRI ECY GPGFRHFBTWKDLHDADGQKI DRAPN PMELLT I SLPREVKPGDMI RAC KEGLVNLYQKD 
GTSKTVRT 

SEQ ID NO: 4 

ATGTCACATATGAAAAAATOTCCCGAGGTCTTATCACXrrG^ 
35 TTGACTATGGCGCAGATGCTGTTTTTGTTGGAGGGCAGGCCTA 

CTCTATGGAAGAATTGCAAGAAGGCATTGATTATGCACATGCGCGT^ 

AACATGGTTACCCACGAAGGGAACGAAATTGGTGCGGGCGAGTGGTT^ 

TTGATGCGGTCATTGTTTttGATCCAGCCTTGACT 

TCATTTGTCAACGCAAGCTTCATCTACCAATTACGAGACCT 
40 CGAGTTGTTTTAGCTCGCGAGGTTAATATGGCCGAGTTAGCAGLAAA 

TTGAAGCCTTTGTCC^TGGAGCC^TGTGTATCT 

TCACCGTGATGCCAACAGGGGCGGCTGCTCACAGTCTTC 

TTTGGAGGAGAGCGCCGCTCCTTAAAAGGGGAAATTCCAGAAGACT 
GTATGATTGACCATATTCCTGACCTGATTGAA^ 

45 ATCTATCCACTACXSTCTCAACCGTAACCAACTGTTACAAGGCGGCTGTAC^ 
GAAGCTTTTTATGCTATCAAAGAGGAATT»TTGA 
CAGGTTTTTACTATGGTATCCCAACTGAAAATGAACAAT^ 
TAAATTTGTCGGAGAAGTAGTIGCCTTTGACTCAGCT 

ATCATGGAAGGCGATCGGATTGAATGTTATGGACC^GO
TACATOATCOGGATCGCCAAAAGATTQACCGTGCCC

GAQAGAAGTTAAGCCAGGGGATATGATTAGGGCTTCCAAGGMGGTCTGG 

GGCACCAGTAAAACTGTTAGAACATAO 

Preferred GAS 130 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (eg. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 3; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 3, wherein n is 7 or more (eg. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150, or more). These GAS 130 proteins include variants (eg.
allelic variants, homologs, orthologs, paralogs, mutants, etc) of SEQ ID NO: 3. Preferred fragments
of (b) comprise an epitope from SEQ ID NO: 3. Other preferred fragments lack one or more amino
acids (eg. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more
amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID
NO: 3. Other fragments omit one or more domains of the protein (eg. omission of a signal peptide, of
a cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).

(3) GAS 277

GAS 277 corresponds to M1 GenBank accession numbers GI: 13622962 and GI: 15675742, to M3
GenBank accession number GI: 21911206, to M18 GenBank accession number GI: 19746852, and is
also referred to as 'Spy1939' (M1), *SpyM3 J670* (M3), and 'SpyM18_2006' (M18). Amino acid
and polynucleotide sequences of GAS 277 of an M1 strain are set forth below:

SEQ ID NO: 5 

MTTMQKT I SLLS LALLIGLLGTSGKA I S VYAQDQHTDNVI AEST I SQVS VEASMRGTB P Y I DATVTTDOP 

VRQPTQATITL10>ASPNTINSWVYTMAA^ . 
QNKARKTPTNMQQKDTSKA^ 

25 ASNSQKNGSNKTKMLVDKEEVKPTSKRGFPWVLLGLVVSLAAGLFIAIQKVSRRK 

SEQ ID NO: 6

ATGAG^CTATGCAAAAAAdAATTAGCTTATTATC^CTA 
GCAAAGCCATATCTGTGTATGCACAAGATCAGCACACTGAT^ 
30 GGTCAGTGTTGAAGCCAGTATGCGTGGAACAGAACCTTATAT 
• GTCAGACAACCAACTCAGGCAACGATAACACTTAAAGACGCT 
ATACTATCGCAGCGCAACAGCGTCteTTTTACAGCTT^ 
TCATGTAACTGTCACCGTTCATACTCAAGAAAAGGCA 

CAAAACAAAGCTAGAAAAACACCAACTAATATGCAACAAAAGGATACTTCTAAA 
35 TCGATCTAGACACAAAAGCTCAAACAAATCAATCAGCTAACCM 

GAGATCAGCTACTAATCATCGATCAACTTCCTTAAAGOGATCTACTAAAAATGAGAAACTO 
GCTAGTAATAGCCAAAAAAACGGTAGCAACAAGACAAAAATGCTAGTGGAC 
CTTCAAAAAGAGGATTCCCTTGGGTCTTATTAGGTCTAGTAGTCAGT^ 
TATTCAAAAAGTATCTAGACGAAAATAA 

Preferred GAS 277 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (eg. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 5; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 5, wherein n is 7 or more (eg. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, or more). These GAS 277 proteins include variants (eg. allelic
variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 5. Preferred fragments of (b)
comprise an epitope from SEQ ID NO: 5. Other preferred fragments lack one or more amino acids
(e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID NO: 5. For
example, in one embodiment, the underlined amino acid sequence at the N-terminus of SEQ ID NO: 5
is removed. Other fragments omit one or more domains of the protein (eg. omission of a signal
peptide, of a cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).



(4) GAS 236 

GAS 236 corresponds to M1 GenBank accession numbers GI: 13622264 and GI: 15675106, to M3
GenBank accession number GI: 21910321, and to M18 GenBank accession number GI: 19746075,
and is also referred to as 'Spy1126' (M1), 'SpyM3_0785' (M3), and 'SpyM18J087* (M18). Amino
acid and polynucleotide sequences of GAS 236 from an M1 strain are set forth below:

SEQ ID NO: 7

MTQMNYTOKVKRVAIIANGKYQSKRV^ 

DKVRFVG I HTGHLGFYTDYRDPEVDKLI DNLRKDKGEQI S Y PI LKVAI TLDDGR WKARALNEATVKR I B 
KTMVADVI INHVKFBSFRGDGI SVSTPTGSTAYNKSLGGAVLHPTI BALQLTE I SSLNNRVFRTLGSSI I 
15 I PKKDKI BLVPKRLGI YTI SI DNKTYQLKNVTKVBYFI DDBKI HFVSS PSHTSFWBRVKDAFI GE I DS 

SEQ ID NO: 8 

ATGACACAGATCAATTATACAGCT 

AACGCG7CGCCTCCAAACTTTTCTCCGTATTTAAAGATGATCCTGATTTCT 
20 GGATATTGTGATTTCTATTGGCGGAGATGGGATGCT 

GATAAGGTACGTTTTGTAGGAATCCACACCGGTCATCTTGGCTTTTAT^ 

TTGATAAATTAATTGATAATTTAAGAAAAGACAAGGGAGAACAAATCT 

TATTACTTTAGATGATGGTCGTGTGGTTAAAGCGCGTC 

AAAACGATGGTAGCAGATGTTATTATTAACCATGTCAAATTTGAAAGCTTC 
25 TATCGACCCCGACAGGGAGCACAGCCTACAATAAATCTTTAGGTC^ 

AGCGCTGCAATTGACGGAAATTTCCAGTCTTAATAATCGTGTCTTTAGAAC 

ATTCCCAAAAAAGATAAGATTGAGTTAGTGCCAAAACGATTAGGAATTTATACCA 

AAACCTATCAGTTAAAAAATGTGACGAAGGTGGAGTATTre 
. CrCTCCGAGTCATACGAGCTTTTGGGAAAGGGTCAAGGATGCC . 

Preferred GAS 236 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (eg. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 7; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 7, wherein n is 7 or more (eg. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150 or more). These GAS 236 proteins include variants (eg. allelic
variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 7. Preferred fragments of (b)
comprise an epitope from SEQ ID NO: 7. Other preferred fragments lack one or more amino acids
(e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID NO: 7. For
example, in one embodiment, the underlined amino acid sequence at the N-terminus of SEQ ID NO: 7
is removed. Other fragments omit one or more domains of the protein (eg. omission of a signal
peptide, of a cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).

(5) GAS 040 

GAS 040 corresponds to M1 GenBank accession numbers GI: 13621545 and GI: 15674449, to M3
GenBank accession number GI: 21909733, to M18 GenBank accession number GI: 19745402, and is
also referred to as 'Spy0269' (M1), 'SpyMa.OW (M3), 'SpyM18_0256' (M18) and 'prgA'. GAS
040 has also been identified as a putative surface exclusion protein. Amino acid and polynucleotide
sequences of GAS 040 from an M1 strain are set forth below:

SEQ ID NO: 9

5 MDLEQTKPNQVKQKIALTSTIALLSA SVGVSHQVKADDRASGBTKASNTHDDSLPKPBTIQBAKATI DAV 
BKTLSQQKABLTBLATALTKTTAB I NHUCEQQDNBQKALTSAQB IYTlTrLASSBBTLLAQGAEKQRBLTA 
TBTBLHNAQADQHSKETALS EQKAS I S ABTTRAQDLVEQVKTS BQN I AKLNAM I SN PDA I TKAAQTANDN 
TKALSSELEKAKADLE^QKAKVKKQLTEEIAAQK^ 

PLBBLKKLEASGY IG3 AS YNNYYKBHADQI I AKASPGNQLNQYQDI PADRNRFVDPDNLTPBVQNBLAQF 
1 0 AAHM I NSVRRQLGL P PVTVTAGSQB PARLLSTS Y KKTHGNTRPS FVYGQPGVSGHYGVG PHDKTI I BDSA 
GASGLI RNDDNMYBN I GAFNDVHTVNGI IQ^GI YDSI KYMLPTDHLHGNTYGHAINFLRVDKHNPNAPVYli 
GFSTSWGSLNEHFVMFPESNIANHQRFNKTPIKAVG . 
KQEADIMAAQAKVSQLQGKLASTLKQSDSI^^ 

SL KAALHQTEAIAEQAAAR VTALVAKKAHLQ YLROF KLN PNRLQVI RBR I DNTKQDLAXTTS SLLNAQBA 
1 5 IAALQAKQSSLEATIATTBHQLTLLKTLANEKEYRHL^ DTTPLV 
QBMVKBTKQLLEASARIJUVBNTSLVABALVGQTSEMVASNAIVSKIT 
SDVDBSTQRALKAGVVM1AAVGLTGPRFRKBSK 



SEQ ID NO: 10

20 ATGGACTTAGAACAAACGAAGCCAAACCAAGTTAAGCAGAAAATO 

TGAGTGCCA GTGTAGGCGTATCTCACCAAGTCAAAGCAGATGATAGAGCCTCAGGAGA 
TAATACTCACGACGATAGTTTACCAAAACC^GAAAGAATTC^ 

GAAAAAACTCTCAGTCAACAAAAAGCAGAACTGACAGAGCTTGCTAC 
AAATCAACCACTTAAAAGAGCAGCAAGATAATGAACAAAAAGCTT^ 

25 TAATACTCTTGCAAGTAGTGAGGAGACGCTATTAGCCCAAGGAGCCGAACATCAAAG^ 
ACTGAAACAGAGCTTCATAATGCTCAAGCAGATCAACATTCAAAAGAGACT 
CTAGCATTI^GCAGAAACTACTCGAGCTCAAGATTTAGTGGAAC^ 
TGCTAAGCTCAATGCTATGATTAGCAATCCTGATGCTATCACTAAAGCAGCTC^ 
ACAAAAGCATTAAGCTCAGAATTGGAGAAGGCTAAAGCTGACTTAGAAAATCAAA 

30 AGCAATTGACTGAAGAGTTGGCAGCTCAGAAAGCTC 

TAAATCCTCAGCTCCGTCTACTCAAGATAGCATTGTGGGTAAT^ 
CCTCTTGAAGAACTTAAAAAATTAGAAGCTAGTGGTT^ 
AAGAGCATGCAGATCAAATTATTGCCAAAGCTAGTCCAGGTAATCAATT 
AGCAGATCGTAATCGCTTTGTTGATCCCGATAATTTGACACCAGAAGTC 

35 GCAGCTCACATGATTAATAGTGTAAGAAGACAATTAGGTCTACCACCAGT^ 
AAGAATTTGCAAGATTACTTAGTACCAGCTATAAGAAAACTCATGGTAA 
CGGACAGCCAGGGGTATCAGGGCATTATGGTGTTGGGCCTCATGATAAAACTA 
GGAGCGTCAGGGCTCATTCGAAATGATGATAAC^TGTACGAGA^ 
. CTGTGAATGGTATTAAACGTGGTATTTATGACAGTATCAAGTATATC 

40 AAATACATACGGCCATGCTATTAACTTTTTACGTGTAGA 
GGATTTTCAACCAGCAATGTAGGATCTTTGA^ 
ACCATCAACGCTTTAATAAGACCCCTATAAAAGCCGTTGGAAGTO 
C^CTGTATCTGATACTATTGCAGCX^TCAAAGGAAAAGTAAGCTCATT 
CATCAAGAAGCTGATATTATGGCAGCCCAAGCTAAAGTAAGTCAAGTTCAAG^ 
. 45 TTAAGCAGTCAGACAGCITAAaTCTCCAAGTC^ 

ATTACTAGCAGrtAAAGCAAMCAAGCACAACTCGAAGCTACTCGTGATCAATCATTAGCTAAGCT 
TCGTTGAAAGCCGCACTGCACCAGACAGAAGCCTTAG 
TGGCTAAAAAA(^TCATTTGCAATATCTAAGGGACTTTAA^ 
TCAGCGCATTGATAATACTAAGCAAGATTTGGCTAAAACTA 

50 TTAGCAGCCITACAAGCTAAACAAAGCAGTCTAGAAGCTACT^ 

TGCTTAAAACCTTAGCTAACGAAAAGGAATATCGCCACTTAGACGAAGA 
GCMGTAGCTCCACCTCTTACGGGCGTAAAACCGCTATCATATAGTAAGATAGATACTACTC 
CAAGAAATGGTTAAAGAAACGAAACAACTATTAGAAGCTTCAGCAAGAT^ 
TTGTAGCAGAAGCGCTTGTTGGCCAAACCTCTGAAATGG 

55 ATCTTCGATTACTCAGCCCTCATCTAAGACATCTTATGGCTCAGGATCTTCT 

TCTGATGTTGATGAAAGTACTCAAAGA GCTCTTAAAGCAGGAGTro 
CAGGATTTAGGTTCCGTAAGGAATCTAAGTGA

Preferred GAS 040 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (eg. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 9; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 9, wherein n is 7 or more (eg. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200, 250 or more). These GAS 040 proteins include variants
(e.g. allelic variants, homologs, orthologs, paralogs, mutants, etc) of SEQ ID NO: 9. Preferred
fragments of (b) comprise an epitope from SEQ ID NO: 9. Other preferred fragments lack one or
more amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one
or more amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ
ID NO: 9. For example, in one embodiment, the underlined amino acid sequence at the N-terminus of
SEQ ID NO: 9 is removed. As another example, in one embodiment, the underlined amino acid
sequence at the C-terminus of SEQ ID NO: 9 is removed. Other fragments omit one or more domains
of the protein (e.g. omission of a signal peptide, of a cytoplasmic domain, of a transmembrane
domain, or of an extracellular domain).

Further illustration of domains within GAS 40 is shown in FIGURES 1 and 2. As shown in these
figures, GAS 40 contains a leader peptide sequence within amino acids 1 - 26, a coiled-coil region
within amino acids 58 - 261, a coiled-coil region within amino acids 556 - 733, a leucine zipper
region within amino acids 673 - 701 and a transmembrane region within amino acids 855 - 866.
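
The region boundaries recited above lend themselves to a simple coordinate table. The following Python sketch is illustrative only: it records the stated ranges as 1-based, inclusive amino acid positions and shows how a fragment lacking, or consisting of, a given region could be derived from a sequence string. The names GAS40_REGIONS, extract_region, remove_region and region_within are hypothetical, and the placeholder sequence is synthetic rather than SEQ ID NO: 9.

```python
# Region boundaries as recited in the text (1-based, inclusive amino acid positions).
GAS40_REGIONS = {
    "leader_peptide": (1, 26),
    "coiled_coil_1": (58, 261),
    "coiled_coil_2": (556, 733),
    "leucine_zipper": (673, 701),
    "transmembrane": (855, 866),
}


def _check(sequence: str, start: int, end: int) -> None:
    if not 1 <= start <= end <= len(sequence):
        raise ValueError("region lies outside the sequence")


def extract_region(sequence: str, start: int, end: int) -> str:
    """Return residues start..end (1-based, inclusive)."""
    _check(sequence, start, end)
    return sequence[start - 1:end]


def remove_region(sequence: str, start: int, end: int) -> str:
    """Return the sequence with residues start..end (1-based, inclusive) deleted."""
    _check(sequence, start, end)
    return sequence[:start - 1] + sequence[end:]


def region_within(inner, outer) -> bool:
    """True if the inner (start, end) range lies entirely inside the outer range."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


if __name__ == "__main__":
    placeholder = "ACDEFGHIKLMNPQRSTVWY" * 45  # synthetic 900-residue stand-in
    without_leader = remove_region(placeholder, *GAS40_REGIONS["leader_peptide"])
    zipper = extract_region(placeholder, *GAS40_REGIONS["leucine_zipper"])
    print(len(without_leader), len(zipper))
```

On the stated coordinates, the leucine zipper range (673 - 701) falls inside the second coiled-coil range (556 - 733), which region_within can confirm.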

The coiled-coil regions of GAS 40 are likely to be involved in the formation of oligomers such as
dimers or trimers. Such oligomers could be homomers (containing two or more GAS 40 proteins
oligomerized together) or heteromers (containing one or more additional GAS proteins oligomerized
with GAS 40).

Accordingly, in one embodiment, the combinations of the invention include a GAS 40 antigen in the
form of an oligomer. The oligomer may comprise two or more GAS 40 antigens or fragments thereof, or
it may comprise GAS 40 or a fragment thereof oligomerized to a second GAS antigen. Preferably, a
GAS 40 fragment used within an oligomer includes a portion of one of the coiled-coil or leucine
zipper domains.

(6) GAS 389 

GAS 389 corresponds to M1 GenBank accession numbers GI: 13622996 and GI: 15675772, to M3
GenBank accession number GI: 21911237, to M18 GenBank accession number GI: 19746884, and is
also referred to as 'Spy1981' (M1), *SpyM3_170r (M3), 'SpyM18_2045' (M18) and 'relA'. GAS
389 has also been identified as a (p)ppGpp synthetase. Amino acid and polynucleotide sequences of
GAS 389 from an M1 strain are set forth below:

SEQ ID NO: 11

• 35 MRNEMAKI MNVTGEEV I ALAATYMT KADVAFVAKALA YATAAH FYQVRKSGB PY I VH PI QVAGI LADLHL 
DAVTVArcFLHDWEDTDITLDEIEAD 
VILVKLADRLHNMRTLKHLRKD^ 
MKEKRREREALVEAIVSKVKTYTTQQGL 

DVYAKVGY I HBLTOPMPGR FKDYI AAPKANGYQSI HTTVYGPKGPIB I QI RTKDMHQVAB YGVAAHWAYK
KGVRGKVNQABQAVOWWI KBLVBLQDASNGDAVDFVDS VXBDX PSBR i YVPTPTGAVQBLPKBSGPI DP
AYAI HTQIGBKATGAKVNGRMVPLTAXLKTGDVVBI I TNANSFGPSRDWVKLVKTNKARNKI RQFFKNQD 
KBLSVNKGRDLLVSYPQEQGYVANKYLDKKRI EAI LPKVSVKSBESLYAAVGPGDISPI SVFNKLTBKBR 
RBBBRAKAKABABBLVKGGBVKHBNKDVLKVRSBNGVIIQGASG 
5 I AI HRSDCKNI KSQDGYQERLI EVBWDLDNSS KDYQAEI DI YGLNRSGLLNDVLQI LSNSTKS I STVNAQ ■ 
PTKDMKPAN IHVSPGI PNLTHLTTWBKI KAV PDVYS VKRTNG 

SEQ ID NO: 12

ATGAGGAAOGAAATGGCAAAAATAATGAACGTAACAGGAGAAGAAGTCATTGC 
1 0 TGACCAAGGCTGATGTGGCTTTTGTGGCAAAGGCTTTAGCATAT^ 

GAGAAAGTCAGGCGAACCCTATATCGTCCATCCGATTCAGGTK 

GATGCTGTGACAGTTGCTTGTGGCTTTTTACATGATGTCOT 

TCGAAGCAGACTTTGGCCATGATGCTCGTGATATCGTTGATGGT^ 

CAAATCTCATGAGGAGCAACTCGCCGAAAACCATCGCAAAATGCTGATGGCT 
15 GTGATTTTGGTGAAATTtX3CTGACCGCCTGCATAATATGOGC^ 

AAGAGCGCATTTCGCGCGAAACCATGGAAATCTATGCCCCCTTGGCGCATC 

CAAAIX^SGAACTAGAAGATTTGGCTTTTCGTTACCTCAATGA 

ATGAAAGAAAAACGTCGCGAGCGTGAAGCTTTGGTAGAGGCTATTGTCACT 

CACAACAAGGGTTGTTTGGAGATGTGTATGGCCGACCAAAACACATCT 
20 GGACAAAAAGAAACGATTCGATCAGATTTTTGATCTGATTC 

GATGTCTATGCTATGGTTGGCTATATTCATGAGCTTO 

TTGCAGCTCCTAAAGCTAATGGCTACCAGTCTATTCATACCACCGTGTATGGGCCAAAAG 

GATTCAAATCAGAACTAAGGACATGCATCAAGTGGCTGAGTACGGGG 

AAAGGCGTGCGTGGTAAGGTCAATCAAGCTGAGCAAGCCGTTGG 
25 AATTGCAAGATGCCTCAAATGGCGATGCAGTGGACT^^ 

ACGGATTTATGTCTTTACACCGACAGGGGCCGTTCAGGAGTTACCAAAAG^ 

GCTTATGCX^TCCATACGCAAATCGGTGAAAAAGCAACAGGTGCCAAAGTCAATG 

TCACTGCCAAGTTAAAAACAGGAGATGTGGTTGAAATCATC^ 
' AGAC1XK3GTAAAACTGGTCAAMCCAATAAGGCTCGCAACAAAATTCGTCA 
30 AAGGAATTGTCAGTGAATAAAGGCCGTGATTTGTTGGTGTCT^ 

ATAAATACCTTGACAAAAAACGCATTGAAGCCATCCT 

CTATGCAGCCGTTGGGTTIK3GTGACATTAGTCCTATCAGTC 

CGTGAAGAAGAAAGGGCCAAGGCTAAAGCAGAAGCTGAAGAATTGGTTAA 

AAAACAAAGATGTGCTCAAGGTTCGCAGTGAAMTGGA^ 
35 GCGGATTGCCAAGTGTTGTiATCCTGTACCTGGTGATCCTA 

ATTGCGATTCACAGATCGGACTGTCATAACATTAA^ 

TCGAGTGGGATTTGGACAATTCGAGTAAAGATTATC^GGCTGAAATTGATATC 

TGGTCTGCTTAATGATGTGCTCCAAATTTTATCAAACTCAACCAAGAG 

CCGACCAAGGACATGAAGTTTGCTAATATTCACXSTGAGCTTTGGC^ 
40 CTGTTGTCX3AAAAAATCAAGGCAGTTCCAGATGTTTA 

Preferred GAS 389 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (eg. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 11; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 11, wherein n is 7 or more (eg. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200, 250 or more). These GAS 389 proteins include variants
(eg. allelic variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 11. Preferred
fragments of (b) comprise an epitope from SEQ ID NO: 11. Other preferred fragments lack one or
more amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one
or more amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ
ID NO: 11. Other fragments omit one or more domains of the protein (eg. omission of a signal
peptide, of a cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).

(7) GAS 504

GAS 504 corresponds to M1 GenBank accession numbers GI: 13622806 and GI: 15675600, to M3
GenBank accession number GI: 21911061, to M18 GenBank accession number GI: 19746708, and is
also referred to as 'Spy1751' (M1), ( SpyM3 J525\ 'SpyM18_1823' (M18) and 'fabK'. GAS 504
has also been identified as a putative trans-2-enoyl-ACP reductase II. Amino acid and polynucleotide
sequences of GAS 504 of an M1 strain are set forth below:

SEQ ID NO: 13 

MKTR I TBLLN I D Y P I FQGQ4AWVADGDLAGA VSNAGGLG 1 1 GGGNA P KB W KAN I DR VKA I TDR P FGVN I 
MLLSPFADDI VDLVI EEGVKVVTTGAGN PGKYMBRLHQAG 1 1 W PWPS VALAKRMB KLGVDAV I ABQ4E 
* 10 AGGHIQKLTTMSLVRQWEAVSIPVIAAGGIADGHGAAAAFMLGAEAVQIGTRFWAKBSN 

LAAKD I DTVI SAQWGH PVRS I KNKLTSA YAKAB KAPL I GQKTATDI EBMGAGS LRHAVI EGDWNGSVM 
AGQI AGLVRKEESCBTI LKDI YYGAARVI QNEAKRWQSVS I BK 

SEQ ID NO: 14 

. 1 5 ATGAAAACACGTATTACAGAATTACTTAATATTGATTACCCCATTT^ 

CTGATGGTGATTT AGCAGGTGCAGTTTCT AATG CTGGTGGTTT^ CAATGCTC C 

CAAAGAAGTCGTTAAAGCTAATATTGATCGTGTCAAAGCTATTACTGATAGACC^^ 

ATGCTTTTATCTCCTTTTGCTGATGATATCGTTGATCTGG 

CAGGCGCAGGAAATCCAGGAAAGTATATGGAAAGACTGCACCAGGCGGGTATAATC 
20 CCCAAGCGtTGCGCTAGCCAAACGTATGGAAAAGCITGGGGTA 

GCTGGAGGACATATTGGCAAGTTAACGACTATGTCTTTAGTAAG 

CTGTCATTGCGGCAGGrrGGTATAGCTGATGGTCA 

TGTTCAAATTGGAACTCGCTTTGTTGTTGCT 

TTAGCAGCAAAAGATATTGATACGGTGATTTCTGCGCAGGTTGTGG 
25 ATAAATTGACCTCAGCTTACGCTAAAGCAGAAAAAGCATTTTTM 

TGAAGAAATGGGAGCAGGATOTCTTCGACACGCTGTTATTGAAGGCGATO 

GCTGGCCAAATTGCAGGGCTTGTGAGAAAAGAAGAAAGCTC 

GTGCAGCTCGTGTTATTCAAAATGAAGCTAAGCGCTGGCAATCTGTT^ 

Preferred GAS 504 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (eg. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 13; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 13, wherein n is 7 or more (eg. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150 or more). These GAS 504 proteins include variants (eg. allelic
variants, homologs, orthologs, paralogs, mutants, etc) of SEQ ID NO: 13. Preferred fragments of (b)
comprise an epitope from SEQ ID NO: 13. Other preferred fragments lack one or more amino acids
(e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID NO: 13.
Other fragments omit one or more domains of the protein (eg. omission of a signal peptide, of a
cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).



(8) GAS 509 

GAS 509 corresponds to M1 GenBank accession numbers GI: 13622692 and GI: 15675496, to M3
GenBank accession number GI: 21910899, to M18 GenBank accession number GI: 19746544, and is
also referred to as 'Spy1618' (M1), 'Spy^.^' (M3), 'SpyM18_1627' (M18) and 'cysM'. GAS
509 has also been identified as a putative O-acetylserine lyase. Amino acid and polynucleotide
sequences of GAS 509 of an M1 strain are set forth below:

SEQ ID NO: 15

MTKIYKTITBLVGQTPIIKWRLIPH&AADVYVKU 

PTSGNTGIGUWVGAAKGYRVI I VM PBTMSLBRRQI I QAYGAE LVLT PGAEGMKGA I AKABTLA I BLGAW 
M PMQFNN PAN PS I HB KTTAQB I LEAF KB I S LDAFVSGVQT0GTLS6VSHVLKKANPBTVI YAVBAEB SAV 
S LSGQE PGPHKIQGISAGPI PHTLDTKAYDQI X RVKSKDALBTARLTGAKBG PLVGI SSGAALYAAI gVAK 
QLGKGKHVLTI LPDWGBRYLSTBLYDVPVI KTK 

SEQ ID NO: 16 

ATGACTAAAATTTACAAAACTATAACAGAATTAGTAGGTCAAACACCTArc 
1 0 TTCCAAACGAAGCTGCTGACGTTTATGTAAAATTAGAAGCT^ 
TATTGCTTTATCGATGATTGAAGCTGCTGAA 

CCAACAAGTGGTAATACAGGTATTGGTCTTGCATGGGTAGGTGCTGCTAAA 

TTATGCCCGAAACTATGAGCTTGGAAAGACGGCAAATCATTCAGG 

ACCTGGAGCAGAAGGTATGAAAGGGGCTATTGCAAAAGCTGAAACTTTAGCAATA 
1 5 ATGCCTATGCAATTTAATAACCCTGCCAATCCAAGCATCCATGAAAAAAC^ . 

AAGCTTTTAAGGAGATTTCTTTAGATGCATTCGTATCTGGTC 

TTCACATGTCTTGAAAAAAGCTAACCCTGAAACTGCT 

TTATCTGGTCAAGAGCCTGGACCACATAAAATTCAAGGTA 

ATACCAAAGCCTATGACCAAATTATCCGTGTTAAATCGAAAGATGCTTTAGA^ 
20 AGCTAAGGAAGGC TTCCTGGTTGGGATTTCTTCrGGAGCTGCTCTTTACGCCGCT 

CAgTTAGGAAAAGGCAAACAT 

TCTATGATGTACCAGTAATTAAGACGAAATAA 
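
The paired amino acid and polynucleotide listings can be cross-checked by translating the polynucleotide codon by codon. The following is a minimal sketch, assuming the standard genetic code and a reading frame that starts at the first nucleotide; the function name `translate` is hypothetical. The input is the first 27 nucleotides of SEQ ID NO: 16 as printed above, and the result corresponds to the first nine residues of SEQ ID NO: 15.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino
    for codon, amino in zip(product(_BASES, repeat=3), _AMINO)
}


def translate(dna: str) -> str:
    """Translate a coding sequence in frame 1, stopping at the first stop codon.

    Any trailing partial codon is ignored.
    """
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        amino = CODON_TABLE[dna[i:i + 3]]
        if amino == "*":  # stop codon: end of the open reading frame
            break
        protein.append(amino)
    return "".join(protein)


# First 27 nucleotides of SEQ ID NO: 16 as printed in this document.
print(translate("ATGACTAAAATTTACAAAACTATAACA"))  # MTKIYKTIT
```

A mismatch between such a translation and the printed amino acid listing would point to a transcription error in one of the two listings rather than to a property of the antigen.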

Preferred GAS 509 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (eg. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 15; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 15, wherein n is 7 or more (eg. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, or more). These GAS 509 proteins include variants (eg. allelic
variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 15. Preferred fragments of (b)
comprise an epitope from SEQ ID NO: 15. Other preferred fragments lack one or more amino acids
(e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID NO: 15. For
example, in one embodiment, the underlined amino acid sequence at the C-terminus of SEQ ID NO:
15 is removed. Other fragments omit one or more domains of the protein (eg. omission of a signal
peptide, of a cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).
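
Criterion (b), a fragment of at least n consecutive amino acids, can be illustrated as a substring test. The following is a minimal sketch; the function names and the toy reference sequence are hypothetical and are not taken from any SEQ ID NO in this document.

```python
from typing import Iterator


def is_fragment(candidate: str, reference: str, n: int = 7) -> bool:
    """True if candidate is a run of at least n consecutive residues of reference."""
    return len(candidate) >= n and candidate in reference


def windows(reference: str, n: int = 7) -> Iterator[str]:
    """Yield every run of exactly n consecutive residues of reference."""
    for start in range(len(reference) - n + 1):
        yield reference[start:start + n]


# Hypothetical toy sequence used only for illustration.
TOY_REFERENCE = "MKTAYIAKQRQISFVK"

print(is_fragment("AYIAKQR", TOY_REFERENCE))   # True: 7 consecutive residues
print(is_fragment("MKTAYI", TOY_REFERENCE))    # False: only 6 residues
print(len(list(windows(TOY_REFERENCE, 7))))    # 10
```

A sequence of length L contains L - n + 1 windows of length n, so the number of minimal fragments grows linearly with the length of the reference protein.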






(9) GAS 366 

GAS 366 corresponds to M1 GenBank accession numbers [...] GI: 30315979, to M3 GenBank accession number GI: 21910712, to M18 GenBank accession number
GI: 19746474, and is also referred to as 'Spy1525' (M1), 'SpyM3_1176' (M3), 'SpyM18_1542'
(M18) and 'murD'. GAS 366 has also been identified as a UDP-N-acetylmuramoylalanine-D-glutamate
ligase or a D-glutamic acid adding enzyme. Amino acid and polynucleotide sequences of
GAS 366 of an M1 strain are set forth below:



SEQ ID NO: 17 

MKVI SNFQNKKI LI LGLAKSGBAAAK LLTKLGALVTVNDS KPFDQNPAAQALLEEGI KVI CGSHPVELLD 
45 ENFEYMVKN PG I PYDN PMVKRALAKE I PI LTEVELAYFVS BAP I IGI TGSNGKTTTTTMI ADVLNAGGQS 
AIASGNIGYPASKWQKAIAGDTLVMBLSSFQLVGVNAFRPHIAVITNLM 

QAQMTBSDYLII4IANQBISATLAKTTKATVI PFSTQKVVDGAYLKDGI LYFKEQAJ IAATDI/3VPGSHNI 
ENALATI AVAKLSGI ADDI I AQCLSHFGGVKHRLQRVGQI KDI TF YNDSKSTNI LATQKALSGFDNSRLI 
* LI AGGLDRGNB FDDLVPDLLGLKQM I 1 IXSESAEIWKRAANKAEVSYLEARKVABATEL^KLAQTGDTI^ 

LSPANASWDMYPNPBVRGDBFLATFDCLRGDA 

SEQ ID NO: 18 

ATGAAAGTGATAACTAATTTTCA^ 
CAGCAAAATTATTGACCAAACTTGGTGCTTTAGTGACTGTT 
5 AGCGGCACAAGCCTTGTTGGAAGAGGGGATTAAGGTCATTTGTC 

GAGAACTTTGAGTACATGGTTAAAAACCCTGGGATTCCTTATGATW 
CAAAGGAAATTCCCATCTTGACTGAAGTAGAATTGGCTTATTTOT 

TACAGGATCAAACGGGAAGACAACCACAACGACAATGATTGCCGATC^^ . 
GCACTCTTATCTGGAAACATTGGTTATCCT^^ 
1 0 TGGTGATGGAATTGTCCTCniTCAATTAGTGGGAGTGAATGCTTTTC 
TAATTTAATGCCGACTCACCTGGACTATCATGGa 
CAAGCTCAGATGACAGAATCAGACTACCTTATTTTAAATGCTAATCAAG^ 
AGACCACCAAAGCAACAGTGATTCCTTITrCAACTCAAA 
AATACTCTATTTTAAAGAACAGGCGATTATAGCreCAACTGACTT 

1 5 GAAAATGCCCTAGCAACTATTGCAGTTGCCAAGTTATCTGG 

TTTCACATTTTGGAGGCGTTAAACATCGTTTGCAACGGGTTGGTCAAATC 
' TGACAGTAAGTCAACCAATATTTTAGCCACTCAAAAAGCT^ 
• TTGATTGCTGGCGGTCTAGATCGTGGCAATGAATTTGACGAT^ 
AGATGATTATTTTGGGAGMTCCGCAGAGCGTATGAAGCGA 
20 TGAAGCTAGAAATGTGGCAGAAGCAACAGAGCTTGCTTTTAAGCTGGC 
CTTAGCCCAGCCAATGCTAGCTGGGATATGTATCCTAATTT^ 
CCTTTGATTGTTTAAGAGGAGATGCCTAA 

Preferred GAS 366 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (eg. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 17; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 17, wherein n is 7 or more (eg. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150 or more). These GAS 366 proteins include variants (eg. allelic
variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 17. Preferred fragments of (b)
comprise an epitope from SEQ ID NO: 17. Other preferred fragments lack one or more amino acids
(e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID NO: 17. For
example, in one embodiment, the underlined amino acid sequence at the N-terminus of SEQ ID NO:
17 is removed. Other fragments omit one or more domains of the protein (eg. omission of a signal
peptide, of a cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).
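
Criterion (a), a percentage identity threshold, can be illustrated on two sequences that have already been aligned. The following is a minimal sketch under stated assumptions: the alignment is supplied by some separate alignment step, gaps are written as '-', and identity is counted over all alignment columns, which is only one of several conventions in use. The function names and toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns in which both sequences carry the same residue.

    Assumes the inputs are already aligned (equal length, '-' for gaps).
    A gap never counts as a match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


def meets_identity(aligned_a: str, aligned_b: str, threshold: float = 50.0) -> bool:
    """True if the aligned pair reaches the given percent identity threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical toy alignment: 6 identical columns out of 8.
print(percent_identity("MKTA-IAK", "MKTAYIAR"))      # 75.0
print(meets_identity("MKTA-IAK", "MKTAYIAR", 80.0))  # False
```

Because the result depends on the alignment and on the choice of denominator, a reported identity value is only comparable with another when both were computed under the same convention.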



(10) GAS 159 

GAS 159 corresponds to M1 GenBank accession numbers GI: 13622244 and GI: 15675088, to M3
GenBank accession number GI: 21910303, to M18 GenBank accession number GI: 19746056, and is
also referred to as 'Spy1105' (M1), 'SpyM3_0767' (M3), 'SpyM18_1067' (M18) and 'potD'. GAS
159 has also been identified as a putative spermidine/putrescine ABC transporter (a periplasmic
transport protein). Amino acid and polynucleotide sequences of GAS 159 of an M1 strain are set
forth below:



SEQ ID NO: 19 

MRKLYSFLAGVLGVI VI LTSLS PI LQKRSGSGSQSDKLVI YNWGDYI DPALLKKPTKETGI BVQYETFDS . 
45 NEAMYTKI KQGGTTYDI AVPSDYTIDKMIKENLLNKLDKSKIiVGMDNIGKEP LGKSFDPQNDYSLPYFWG 
TVGI VYNDQLVDKAPMHWEDLWRPEYKNS I ML I IX»AREMLGVGLTTFGYS VNS KNLBQLQAAERKLQQLT 
PNVKAIVADEMKGYMIQGBAAIGITFSGEASBML 

FLNF I NR PENAAQNAAY IGYATPNKKAKALLPDE I KNDPAFY PTDDI I KKLEVYDNLGSR WLGI YNDLYL 



QPKMYRK 



SEQ ID NO: 20 

ATCOCTAAACTTfATTCCTTTCTAGCAGGAGTTT 
5 TCTTGCAGAAAAAATCGGGTTCTGGTAGTCAATCGGATAAA * 

TGATCCAGCTITCCTCAAAAAATTCACCAAAGAAACGGGCATTGM^ 

AATGAAGC.CATGTACACTAAAATCAAGCAGGGCGGAACCACTTACGACATT^ 

CCATTGATAAAATGATCAAAGAAAACCTACTCAATAAGCTTC 

TATOGGGAAAGAATTTTTAGGGAAAAGCTTTGACCCACAAAACGACT 
10 ACCGTTGGGATTGTTTATAATGATCAATTAGTTGATAAGGCGCCT 

CAGAATATAAAAATAGTATTATGCTGATTGATGGAGCGCGTGAAATGCT 

TGGTTATAGTGTGAATTCTAAAAATCTAGAGCAGTTGCAGGCAGCCGAGAGAAAA 

CCGMTGTTAAAGCCATTGTAGCAGATGAGATGAAAGGCTACATGATTC 

TTACCTTTTCTGGTGAAGCCAGTGAGATGTTAGATAGTAACGAACAC 
1 S AGGGTCTAACCTTTGGTTTGATAATTTGGT ACTACCAAAAACCATGAAACACGAAAAAGAAGCTTATGCT 

TTTTTOAACmATCAATCGTCtrGAA^ 

ATAAAAAAGCCAAGGCCTTACTTCCAGATGAGATAAAAAATGATCCTGCTTTT^ 

TATCAAAAAATTGQAAGTTTATGACAATTTAGGGTCAAGATW 

CAATTTAAAATGTATCGCAAATAA 

Preferred GAS 159 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (eg. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 19; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 19, wherein n is 7 or more (eg. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150 or more). These GAS 159 proteins include variants (eg. allelic
variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 19. Preferred fragments of (b)
comprise an epitope from SEQ ID NO: 19. Other preferred fragments lack one or more amino acids
(e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID NO: 19. For
example, in one embodiment, the underlined amino acid sequence at the N-terminus of SEQ ID NO:
19 is removed. In another example, the underlined amino acid sequence at the C-terminus of SEQ ID
NO: 19 is removed. Other fragments omit one or more domains of the protein (eg. omission of a
signal peptide, of a cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).
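
Fragments that lack residues from the N-terminus and/or the C-terminus can be illustrated as a slice of the full-length sequence. The following is a minimal sketch; the function name `truncate_termini` and the toy sequence are hypothetical and do not reproduce the underlined regions of any SEQ ID NO.

```python
def truncate_termini(sequence: str, n_term: int = 0, c_term: int = 0) -> str:
    """Return sequence with n_term residues removed from the N-terminus
    and c_term residues removed from the C-terminus."""
    if n_term < 0 or c_term < 0:
        raise ValueError("residue counts must be non-negative")
    if n_term + c_term >= len(sequence):
        raise ValueError("truncation would remove the whole sequence")
    return sequence[n_term:len(sequence) - c_term]


# Hypothetical toy sequence used only for illustration.
toy = "MKTAYIAKQRQISFVK"
print(truncate_termini(toy, n_term=3))            # AYIAKQRQISFVK
print(truncate_termini(toy, c_term=2))            # MKTAYIAKQRQISF
print(truncate_termini(toy, n_term=3, c_term=2))  # AYIAKQRQISF
```

The slice end is written as `len(sequence) - c_term` rather than `-c_term` so that `c_term=0` leaves the C-terminus intact.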

(11) GAS 217 

GAS 217 corresponds to M1 GenBank accession numbers GI: 13622089 and GI: 15674945, to M3
[...]
also referred to as 'Spy0925' (M1), 'SpyM3_0638' (M3), and 'SpyM18_0982' (M18). GAS 217 has
also been identified as a putative oxidoreductase. Amino acid and polynucleotide sequences of GAS
217 of an M1 strain are set forth below:

SEQ ID NO: 21

MAQRI IVITGASGGLAQAI VKQLP^ 

QRYGR IDVLI NNAGYGAFKGFEEFS AQE I ADMFQVOTLAS I HFACLI GQKMAEQGQGHLIN I VSMAGLI A 

SAKSSIYSATKFALIGFSNALRLBLADKGVYVT^ 

RLVSI IGKNKRBLNLPFSLAVTHQFYTLFPKLSDYLARKVFNYK 



SEQ ID NO: 22

ATGGCACAAAGAATCATTGTTATCACGGGAGCTTCTGGAG 
CCAAGGAAGACAGCTTGATTTTATTAGGACGTAACAA^ 


CMCAAAGAATGCCTCGAGTTGGATATTACCAATCCAGTAGCCA 

CAGCGCTATGGCCGTATTGATG T CrT CA TTAATAATCCTCGCT 

TTTCT G CCCAAGAAATAGCTC^T ATOri TCAGGCT 

TGGTCAGAAAATGGCAGAGCAGGGGCAAOTTCACCTTATTAATATTGTGTCCA 
5 TCAGCCAAATCGAGCATTTATTCAGCCACCAAGTTTGCCCTTATCGGAT^ 
AATTAGCGGATAAAGGGGTTTACGTGACCACCGTGAATCCAGGTCC^ 
AGCTGACCCGTCTGGACATTATTTGGAAAGCGTTGGTAAATTC 
CGTTTGGTTTCTATTATCGGGAAAAATAAACGAGAATTGAATTO 
AATTTTACACCCTTTTCCCTAAATTATCTGATTATCTTC 

Preferred GAS 217 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (eg. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 21; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 21, wherein n is 7 or more (eg. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, or more). These GAS 217 proteins include variants (eg. allelic
variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 21. Preferred fragments of (b)
comprise an epitope from SEQ ID NO: 21. Other preferred fragments lack one or more amino acids
(e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID NO: 21.
Other fragments omit one or more domains of the protein (eg. omission of a signal peptide, of a
cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).

(12) GAS 309 

GAS 309 corresponds to M1 GenBank accession numbers GI: 13621426 and GI: 15674341, to M3
GenBank accession number GI: 21909633, to M18 GenBank accession number GI: 19745363, and is
also referred to as 'SpyOlM' (M1), 'SpyND.tKWr (M3), 'SpyM18_0205' (M18), 'nra' and 'rofA'.
GAS 309 has also been identified as a regulatory protein and a negative transcriptional regulator.
Amino acid and polynucleotide sequences of GAS 309 of an M1 strain are set forth below:

SEQ ID NO: 23 

MIBKYLESSIBSKCQLIVLFFKTSYLPITEVAEKTC 
30 .TOPFKETYLyQLYASSNVLQLIAPLI KMGSHSRPLTDFARSHFLSNSSAYRMRBALI PLLRNFBLKLSKN 
KIVGBEYRIRYLIALLYSKFGIKVYDLTQQDKNTIHSFLSHSSTHLKTSPWLS 

QFSVTI PQTRI FOOLKKLFVYDSLKKS S HDI I ETYCQLNFS AGDLD YLYLI Y I TANNS FAS LQWTPBH I R 
QYCQLFEENDTFRLLLNPI ITLLPNLKEQKASLV^ PBTNLFVSPYYKGNQKLY 
TSLKLIVEEWMAKLPGKRDLNHKHPHLFCHYVEQS 
35 . IDFHSY YLLQDNVYQI PDLKPDLVI THSQLI PFVHHBLTKG I AVAE I SFDES I LS IQELMYQVKEEKFQA 
DLTKQLT 

SEQ ID NO: 24 

TTGATAGAAAAATACTTGGAATCATCAATCGAATCAAAAT 
40 CTTATTTGCCAATAACTGAGGTAGCAGAAAAAACTGGCTTA^^ 
GGAACTGAATGCCTTTTTCCXrrGGTAGTCTGT^ 
ACACATCCTTTTAAAGAAACTTATCTTTACCAACTCT 
TTTTAATAAAAAATGGTTCCCACTCTCGTCCCC 

CTCAGCTTATCGGATGCGCGAAGCATTGACT » 
45 AAGATTGTCGGTGAGGAATATCGCATCCGTTACCTCATCGCTCTGCTATATACT 

TTTATGACTTGACGCAGCAAGACAAAAAPVCTATTCATAGCTT^ 

AACCTCTCCTTGGTTATCGGAATCGTTTTCTTTCTATGA 

CAATTTTCXXSTAACTATTCCCCAAACCAGAATTT^ 

TlGAAAAAAAGTAGCCATGATATTATCGAAACTTACTGCCAACTAAAC 
50 CCTCTATTTAATTTATATCACCGCTAA 


CAATATTGTCAAaTtTTGAAGAAAATGATAC^ 

CTAACCTAAAACACCAAAACCCTAGTTTAGTAAAAGCTCTTA 1*0 HTlTn'CAAAATCATTlTltjfTTAA 

TCTGCAACATTTTATTCCTGAGACCAACTTATT 

ACGTCCTTAAAGTTAATTCTCCAAGAGTGGATGCCCAAACTTCC^ 
5 AT TTT C AT CTr TTTT G CCACTATGTCGA<X!AA^ ITT 

CGTAGCCAGTAATTTTATCAATGCTCATCTCCTAACGGATTCTTn 

ATTGATTTTCATTCCTATTATCTATTGCAAGATAATGTTTATCAAA 

TCATCACTCACACn i CAACTGATTCCTTTT6TTCACCATG 

ATCTTTTGATGAATCGATTCTGTCTATCCAA 
10 GATTTAACCAAGCAATTAACATAA 

Preferred GAS 309 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (eg. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 23; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 23, wherein n is 7 or more (eg. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, or more). These GAS 309 proteins include variants (eg. allelic
variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 23. Preferred fragments of (b)
comprise an epitope from SEQ ID NO: 23. Other preferred fragments lack one or more amino acids
(e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID NO: 23.
Other fragments omit one or more domains of the protein (eg. omission of a signal peptide, of a
cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).
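
Omission of a domain can be illustrated by deleting a residue range from the full-length sequence. The following is a minimal sketch, assuming the domain boundaries are already known as 1-based inclusive residue positions; the function name `omit_domain`, the toy sequence and the coordinates are hypothetical, since no domain boundaries are given here.

```python
def omit_domain(sequence: str, start: int, end: int) -> str:
    """Return sequence without residues start..end (1-based, inclusive)."""
    if not 1 <= start <= end <= len(sequence):
        raise ValueError("domain coordinates fall outside the sequence")
    return sequence[:start - 1] + sequence[end:]


# Hypothetical toy sequence and coordinates used only for illustration.
toy = "MKTAYIAKQRQISFVK"
print(omit_domain(toy, 1, 5))  # IAKQRQISFVK (e.g. an N-terminal region removed)
print(omit_domain(toy, 6, 8))  # MKTAYQRQISFVK (an internal region removed)
```

Removing an N-terminal region in this way is equivalent to an N-terminal truncation, whereas removing an internal region joins the flanking residues directly.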

(13) GAS 372 

GAS 372 corresponds to M1 GenBank accession numbers GI: 13622698 and GI: 15675501, to M3
GenBank accession number GI: 21910905, to M18 GenBank accession number GI: 19746500 and is
also referred to as 'Spy1625' (M1), 'SpyM3_1369' (M3), and 'SpyM18_1634' (M18). GAS 372 has
also been identified as a putative protein kinase or a putative eukaryotic-type serine/threonine kinase.
Amino acid and polynucleotide sequences of GAS 372 of an M1 strain are set forth below:

SEQ ID NO: 25

30 M I QI GKLFAGRYR I LKS IGRGQ4ADVYLANDLI LDNEDVAI KVLRTNYQTDQVAVARFQREARAMAELNH 
PNIVAIRDIGEEDGOQFLVMBYVIX^ 

NILLTKEGWKVTDrciAVAFAETSLT^ 

PYDGDSAVTI ALQHFQKPLPSI I EENHNVPQALENWI RATAKKLSDRY6STPEMSRDU1TALSYNRSRE 
RKI I FENVESTKPLPKVASGPTAS VKLS PPTPT\^TQESRLDOTNQTDALQPPTKKKKSGRFLGTLPKIL 
35 FSPPIVGVALFTYLILTKPTSVKVPNVAGTSLKVAKQELYDVGLKTO 

TAKRQGSS I TLYVSIGMKGFDMENYKGLDYQEAMNSLI ETYGVPKSKI KI ER I VTNE YPENTVI SQS PSA 
GDKFNPNGKSKI TLSVAVSDTITMPMVTEYSYADAVNTLTAIiGIDASRI KAYVPS SSSATGFVPI HS PSS 
KAIVSGQSPYYGTSLSLSDKGEISLYLYPEETHSSSSSSSSTSSSNSSSINDSTAPGSNTELSPSETTSQ 
TP 


SEQ ID NO: 26 

ATGATTCAGATTGGCAAATTATTTGCTGGTCGTTATCGCAT^ 

CGGATGTTTATTTAGCAAATGACTTGATCTTGGATAATGAAG^ 

TTATCAAACAGATCAGGTAGCAGTTGCGCGTTTCCAACGAGAAGCGCGGGCC^ 
45 CCCAATATTGTTGCCATCCGGGATATAGGTGAAGAAGACGGACAGCAAT^ 

ATGGTGCTGACCTAAAGAGATACATTCAAAATCATGCTCCATTATCT 

GGAAGAAGTCCTTTCTGCTATGACTTTAGCCCACCAAAAAGGAATTGTACAC^ 

AATATCCTACTAACTAAGGAGGGTGTTGTCAAAGTAACTGA 

CAAGCTTGACAC^AACTAATTCGATGTTAGGCAGTGTTC^ 
50 CAAAGCGACGATTCAAAGTGATATTTATGCGATGGGGATTATGCT 

ccttatgacggcgatagtgctgttacgattgccttgcaa 
aggagaaccacmtgtgccacaagc tit g 6 agaatcttgt^ 

TCGTTACXjGGTCAACCTTTGAAATGACTCG 

CGTAAGATTATCTTTGAGAATGTTGAAAGTACCAAACCCCTCCCCA 
5 CTGTAAAATTGTCTCCCCCTACCCCAACAGTGTTAACACAGGAAAGTTC 

AGATGCTTTACAGCCCCCCACCAAAAAGAAAAAAAGTGGTCGTrTTTTAGCT 

TTTTCTTTCTTTATTGTAGGTGTAGCACTCTTTACTTAT^ 

TTCCTAATGTAGCAGGCACTAGTCTTAAAGTTGCCAAACAAGAACTGTATGATC 

TAAAATCAGGCAAATTGAGAGTGATACGGTTGCTGAGGGAAATGTAGTT 
10 ACAGCTAAGAGGCAAGGCTCAAGCATTACGCTTTATGTGTC^ 

ACTACAAAGGACTAGATTATCAAGAAGCTATGAATAGTTTGATAGAAA 

AATCAAAATTGAGCGCATTGTAACTAATGAATATCCTGAAAATACAGTCATCAGTC 

GGTGATAAATTTAATCCAAACGGAAAGTCTAAAATTACGCTCAGTGTTGCTGTT 

TGCCTATGGTAACAGAATATAGTTATGCAGATGCAGTCAATACCTT^ 
1 5 TAGAATAAAAGCTTATGTGCCAAGCTCTAGCTCAGCAACGGGCTTTCTGC^ 

AAAGCTATTGTCAGTGGTCAATCTCCTTACTATGGAACGTCTTTGA 

GTCTTTACCTTTATCCAGAAGAAACACACTCTTCT 

TTCTTCAATAAATG ATAGTACTG CACCAGGTAGCAACACTGAATTAAGC CCATCAGAAACT ACTTCTCAA 
ACACCTTAA 

Preferred GAS 372 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (e.g. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 25; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 25, wherein n is 7 or more (eg. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200, 250 or more). These GAS 372 proteins include variants
(eg. allelic variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 25. Preferred
fragments of (b) comprise an epitope from SEQ ID NO: 25. Other preferred fragments lack one or
more amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one
or more amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ
ID NO: 25. Other fragments omit one or more domains of the protein (eg. omission of a signal
peptide, of a cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).



(14) GAS 039 

GAS 039 corresponds to M1 GenBank accession numbers GI: 13621542 and GI: 15674446, to M3
GenBank accession number GI: 21909730, to M18 GenBank accession number GI: 19745398 and is
also referred to as 'Spy0266' (M1), 'SpyM3_0194' (M3), and 'SpyM18_0250' (M18). Amino acid
and polynucleotide sequences of GAS 039 of an M1 strain are set forth below:

SEQ ID NO: 27

NTOLILFLLVLVLLGIX3AYLLFKVNG 
40 LYQQLTDI RDVLHRS LSDSRDRS DKRLEKI NQQ WQSLKNMQESNEKRLE KMRQI VEEKLEETLKNRLHA 
SFTDSVSKQLESWKGLGIMRSVAQDVGT1J4 

SERVEYAI KLPGNGQGGYI YLPIDSKFPLEDYYRLEDAYEVGDKIAIEASRKALLAAI KRFAKDIHKKYL 
NPPBTTNFGVMFLPTBC^YSEVVRNASFFDSLRRBE 

XI LQJVKLBFDKFGGLLAKAQKQMNTANNTLDQLI STRTNAI VRALNTVETYQDQATKSLLNMPLLEEEN 
43 MEM ' 


SEQ ID NO: 28 

ATGGACCTTATCTTGTTCCTTTTGGTCTTGGTTCT 
ACGGCCTTCAACATCAGCTTGCCCAAACCCTAGAAGGCAACGC^ 
50 CCAGTTGGATACAGCTAACAAACAACAATTC 

CTTTACCAACAATTAACAGATATTCGTGACGTCTTG^ 



ACGTTTGGAGAAMTGCGCCAGATCGTTGAAGAAAAATTGGAAGAto 
TCTTTCGATTCTOTATCCAAGCAACTAGAAAGTGTCM 

AACATGTGGGTACTTTAAATAAGGTTTTGTCCAATACCAAAACACG^ 
5 AGGCCAAATCATTGAGGATATCATGACATCAAGCCACTACG 

AGTGAACGCGTAGJUttATGCGATTAAGCTCCCAGGAAATGCT 

ACTCAAAATTC(XTCTTCAAGATTATTACCGATTAGAAGATC 

CGAGGCTAGCCGAAAAGCACTTCTGGCAGCTATCAAACGCTTTC 

AACCCCCCAGAGACGACCAATTTCGGAGTTATGTTCTTACCAA 
10 GAAATGCGTCTTT CT T TGA TAGCCTTCGTCGGGAAGAA 

TGCTTTGCTGAATTCCTTATCTGTTGGTTTCAAGACCCTTAAT^ 

AAAATTTTAGGCAATCTCAAGTTAGAATTCGATAAATTT^^ 

TGAATACAGCTAATAATACGCTGGATCAGCTCATTTCAACAAGGACAAATC 

TACCGTTGAAACTTATCAAGACCAAGCAACAAAATCTCTCTTGAACA 
IS AATGAAAATTAA 

Preferred GAS 039 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (eg. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 27; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 27, wherein n is 7 or more (eg. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150, or more). These GAS 039 proteins include variants (eg.
allelic variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 27. Preferred fragments
of (b) comprise an epitope from SEQ ID NO: 27. Other preferred fragments lack one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more
amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID
NO: 27. Other fragments omit one or more domains of the protein (eg. omission of a signal peptide,
of a cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).


(15) GAS 042 

GAS 042 corresponds to M1 GenBank accession numbers GI: 13621559 and GI: 15674461, to M3
GenBank accession number GI: 21909745, to M18 GenBank accession number GI: 19745415, and is
also referred to as 'SpyOlST (M1), 'SpyM3_0209' (M3), and 'SpyM18_0275' (M18). Amino acid
and polynucleotide sequences of GAS 042 of an M1 strain are set forth below:

SEQ ID NO: 29 

MTKBKLVAFSQAHABPAWLQERRLAALEAI PNLELPTI ERVKFHRWNLGDGTLTENESIiASVPDFI AIGD 
35 NPKLVQVGTQTVLEQLPMALI DKGWFSD PYTALEB I PEVI BAHFGQALAPDEDKLAAYHTAYFNS AAVL • 
YVPDHLBITTPIEAIFLQDSDSDVPFNKHVLVIAGKESKF 

QI KFSAI DRLGPSVTTY I SIUlGIUiBKDAKIDWALAVMNEGNVI AOFDSDLIGQGSQADLICVVAASSGRQV 
QGI DTRVTNYGQRTVGH I LQHGVI LERGTLT FNGIGH I LKDAKGADAQQES RVLMLSDQARADAN P I LL»I 
DENEVTAGWAASIGQVDPEDMYYLMSRGLDQBTAERLVIRGPLGAVI AEI PI PSVRQEI IKVLDEKLLNR 


SEQ ID NO: 30 

ATGACAAAAGAAAAACTAGTGGCTTTTTCGCAAGCCCACGCT 

TAGCX3GCATTAGAAGCCATTCCAAATTTGGAATTACCAACCATCGAAA 

TCTAGGAGATGGTACCTTAACAGAAAATGAAAGTCTAGCTAGTGTTC 
45 AACCCAAAGCTTGTTCAGGTAGGCACXSCAAACAGTCTTAGAACA 

GAGTTGTTTTCAGTGATTTTTATACGGCGCTTGAGGAA^ 

GGCATTAGCTTTT^TGAAGAO^CTAGCTGCCTACCA 

TACGTTCCTGATCACTTGGAAATCACAACTCCTATTGAAG 

TTCCTTTTAAC^GCATGTTCTAGTGATTGCAGGAA^ 
50 ATCTATTGGCAATGCCACTCAAAAGATCAGCGCTAATATCAGTGTAGAA^ 

CAGATTAAATTCTCGGCTATCGACCGCTTAGGTCCTTCAG 


TAGAGAAGGATGCCAACATTGATTGGGCCTTAGCTGTGATGAATG 
CAGTQATTTOATTGCTCAGGGCTCACAAGCTC^ 
CAAGGTATTGACACGCGCGTGACCAACTATTCTCAACGTO 
.TTTTGGAACGTGGCACCTTAACGTTTAACGGGATTGGTCATATTCT 
5 TCAACAAGAAAGCCGTGTTTTGATGCTTTCTGA 

GATGAAAATOAAGTAACAGCAGGTCATGCAGCTTCTATCGGTCAGGTTGAC 
TGATGAGTCGAGGACTtX^TCAAGAAACAGCAGAACGATTGGTTATT 
CGCTGAAATTCCTATTCCATCAGTCCGCCAAGAGATTATTAAGGTTT^ 
TAA 

Preferred GAS 042 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (eg. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 29; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 29, wherein n is 7 or more (eg. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150, or more). These GAS 042 proteins include variants (eg.
allelic variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 29. Preferred fragments
of (b) comprise an epitope from SEQ ID NO: 29. Other preferred fragments lack one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more
amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID
NO: 29. Other fragments omit one or more domains of the protein (eg. omission of a signal peptide,
of a cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).



(16) GAS 058 

GAS 058 corresponds to M1 GenBank accession numbers GI: 13621663 and GI: 15674556, to M3
GenBank accession number GI: 21909841, to M18 GenBank accession number GI: 19745567 and is
also referred to as 'Spy0430' (M1), 'SpyM3_0305' (M3), and 'SpyMlSJMT (M18). Amino acid
and polynucleotide sequences of GAS 058 of an M1 strain are set forth below:

SEQ ID NO: 31

M KWSGFMKTKS KRFMJLATLCLAU 

GYLEGYEKGLKGDDI PERPKIQVPBDVQPSDHGDYRDGYEBGPGEGQHKROPLBTEABDDSQGGRQBGRO 
30 GHQEGADSSDLNVEESDGLSVIDBWGVI YQAFSTI WTYLSGLF 

SEQ ID NO: 32

ATGAAATGGAGTGGTTTTATGAAAACAAAATCAAAACGCTTTTTAAACCT 

TACTAGGAACAACTTTGCTAATGGCACATCCCGTACAGGCGGAGG 
35 TCGCTTCGGGTTAGGCG&TTTAGAAGATGATTCAGCTAA 

GGATATTTAGAGGGATATGAAAAAGGCTTAAAAGG^ 

CTGAGGATGTTCAGCCATCTGACC^TGGCGA 

ACATAAACGTGATCCATTAGAAACAGAAGCAGAAGATGATTCT 

GGACATCAAGAAGGAGCAGATTCTAGTGATTTGAACGTTGAAGAAAGCGA 
40 AAGTAGTTGGAGTAATTTATCAAGCATTTAGTACTAT^ 

Preferred GAS 058 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (eg. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 31; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 31, wherein n is 7 or more (eg. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150, or more). These GAS 058 proteins include variants (eg.
allelic variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 31. Preferred fragments
of (b) comprise an epitope from SEQ ID NO: 31. Other preferred fragments lack one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more
amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID
NO: 31. For example, in one embodiment, the underlined amino acid sequence at the N-terminus of
SEQ ID NO: 31 is removed. Other fragments omit one or more domains of the protein (eg. omission
of a signal peptide, of a cytoplasmic domain, of a transmembrane domain, or of an extracellular
domain).

(17) GAS 290 

GAS 290 corresponds to M1 GenBank accession numbers GI: 13622978 and GI: 15675757, to M3
[...]
also referred to as 'Spy1959' (M1), 'SpyM3_1685' (M3), and 'SpyM18_2026' (M18). Amino acid
and polynucleotide sequences of GAS 290 of an M1 strain are set forth below:

SEQ ID NO: 33 

MKHILPIVGSLREGSFhfflQLAAQAQKALBHQAWSYLNWKDVPVLKQDIBANAP 
15 WIFTPVYNPSIPGSVKNLLDWLSRAU)LSDPTGPSAIGGKVVTVSSVANG^ 
AGEPTKATVNPDAWGTGRLEI SKBTKANLLSQABALLAAI 

SEQ ID NO: 34 

ATGAAACATATTTTATTTATTGTTGGCTCGCTT 
20 GACAAAAAGCTCTGGAACATCAAGCAGTTGTAT 

AGATATCGAAGCTAATGCACCTTTACCAGTTGTTGACGCTCGTCAAGCTGTTC 

TGGATTTTTACACCAGTTTACAACTTCTCTATTCCAGGTT^ 

GTGCTCTTGATTTGTCTGATCCGACGGGCCCATCTGCTATTGGCGCT 

TGCAAATGGCGGGCATGATCAAGTATTTGATCAGTTTAA^ 
25' GCAGGAGAGTTTACAAAAGCAACTGTGAATCCTGATGCCTGGGGAACAGG 

AGACAAAAGCAAACTTGCTATCTCAGGCAGAGGCTCTTTT 

Preferred GAS 290 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (eg. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 33; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 33, wherein n is 7 or more (eg. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100 or more). These GAS 290 proteins include variants (eg. allelic
variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 33. Preferred fragments of (b)
comprise an epitope from SEQ ID NO: 33. Other preferred fragments lack one or more amino acids
(e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID NO: 33.
Other fragments omit one or more domains of the protein (e.g. omission of a signal peptide, of a
cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).

(18) GAS 511 

GAS 511 corresponds to M1 GenBank accession numbers GI: 13622798 and GI: 15675592, to M3
GenBank accession number GI: 21911053, to M18 GenBank accession number GI: 19746700 and is
also referred to as 'Spy1743' (M1), 'SpyM3_1517' (M3), 'SpyMloMSlS' (M18) and 'accA'. Amino
acid and polynucleotide sequences of GAS 511 of an M1 strain are set forth below:

SEQ ID NO: 35

MTDVSRILKEARDQGRLTTU)YANLIFDDFMBLHGDRHPSDDGAIVGGLAYLAGQ 
NLARNFGQPNPEGYRKALRl>fKQA£KFGRPVVTFINTAGAYPGV^ 

A 1 1 1 GEGGS GGALALAVAOQVWMLENTMY AVLS P EG PAS I LWKOGS RATBAABLMKITAGB LY KMO I VDR 
S II PBHGYFSSE I VDI I KAN LI BQITSLQAKPLDQLLDERYQRFRKY 

SEQ ID NO: 36 

ATGACAGATGTATCAAGAATTTTAAAAGAAGCGCGTGATC^ . 

ACCTTATTTTCGATGACTTTATGGAACTGCATGG 
1 0 TGGCCTAGCTTATTItKSCGGGACAACCTGTTACGGTCAtTGGTAT^ 

AATTTGGCAAGGAATTTTGGCCAGCCCMTCCAGAAGGTTATCGTA 

CAGAAAAATTTGGACGACCAGTTGTTACGTTTATCMTACTGCAGGAGCCTATCCA 

AGAACGAGGACAGGGTGAGGCCATTGCTAAAAAtTTGATGGA^ 

GCCATCATTATTGGTGAAGGAGGCTCTGGTGGTGCATTAGCCCT 
1 5 TTGAAAATACTATGTATGCGGTTCTTAGCCCAGAAGGCTTTGCTTCT 

GGCGACCGAGGCCGCTGAATTGATGAAAATCACAGCGGGTGAACT 

ATTATTCCAGAACATGGTTATTTTTCAAGTGAAATCGTTGACATCA 

TAACCAGTTTGCAAGCTAAGCCATTAGACCAATTATTAGATGAGCGCT 

A . 

Preferred GAS 511 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (eg. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 35; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 35, wherein n is 7 or more (eg. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100 or more). These GAS 511 proteins include variants (eg. allelic
variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 35. Preferred fragments of (b)
comprise an epitope from SEQ ID NO: 35. Other preferred fragments lack one or more amino acids
(e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID NO: 35.
Other fragments omit one or more domains of the protein (eg. omission of a signal peptide, of a
cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).

(19) GAS 533 

GAS 533 corresponds to M1 GenBank accession numbers GI: 13622912 and GI: 15675696, to M3
GenBank accession number GI: 21911157, to M18 GenBank accession number GI: 19746804 and is
also referred to as 'Spy1877' (M1), 'SpyM3_1621' (M3), 'SpyM18_1942' (M18) and 'glnA'. GAS
533 has also been identified as a putative glutamine synthetase. Amino acid and polynucleotide
sequences of GAS 533 of an M1 strain are set forth below:


SEQ ID NO: 37 

. MAITVADIRREVKEKNVTFLRLMFTDIMGVMKNVEI PATKBQLDKVLSNKVMFDGSS I EGFVRINESDMY 
40 LYPDLDTWI VF PWGDENGAVAGLI CDI YTAEGKPFAGD PRGNLKRALKHMNE I GYKS FNLGPE PB F PLFK 
MDDKGN PTLEVNDNGGYFDLAPI DLADNTRREI VN I LTKMGFEVBASHHEVAVGQHEI DFKYADVLKACD 
NIQI FKLVVKTIAREHGLYATFMAKPKFGIAGSGMHCNMSLFDNQGNNAFYDEADKRGM^ 
..' GLMKHAYNYTAITNPTVNSYKRLVPGYEAPVYVAWAGSNRSPLIRVPASRGMGTRLEI^ 
LAVLLEAGLDGI I NKI EAPE PVBAN I YTMTMEERNBAG 1 1 DLPSTLHN ALKALQKDDWQKALGYH I YTN 
45 FLEAKRIEWSSYATFVSQWEIDHYIHNY 


SEQ ID NO: 38 

ATGGCAATAACAGTAGCTGACATTTCTCGTGAAGTCAAAGAAAAA/^ . . 

TCACTGATATCATGGGCGTTATGAAAAATGTGGAGATTCCTGCAACTAAAGAA 
50 GTCTAACAAGGTTATGTTTGATGGTTCATCTATCGAAGGTTTTCT 

CTTTACCCCGATTTAGACACTTGGA'i'lXjrri' nrcCTGGGGAGATGAAAATCGAGCAQTTGCAGGTTTAA 
TTTOTGATATTTATACAGCAGAAGGAAAGCCTTTTGCAGGAGATCCT 
GAAACACATGAACGAGATCGGCTACAAATCATTTAATCTTGGACCAGAACC^ 
ATGGATGATAAAGGTAATCCGACACTTCAACT 
5 ACTTACCACACAACACGCOCCGTGAAAT 

TCATCATGAAGTGGCTGTTGGTCAAC^TGAGATTGATTTTAAATA 

AATATTCAAATTTTTAAGCTAGTTGTAAAAACGATTGCCC^ 

CTAAACCAAAATTTGGAATAGCTGGATCAGGGA 

TAATGCTTTTTATGATGAAGCTGATAAGCGAGGGATGCAGTTATCAGAAGA 
1 0 GGACTAATGAAGWTGCTTATMCTACACTGCTATCACTAACCCTA 
TTCCAGGTTATGAGGCACCTGTTTATGTCGCTTGGGCTGGAAGTAATC 
AGCATCACGTGGTATGGGAACGC<nT^ 

TTGGCrGTTCTCrTGOAAGCTGGACT^ . 
CTAACATTTATACCATGACAATGGAAGAACGAAATGAAGCAGGCAT^ 
15 TAATGCCTTAAAAGCTCTTCAAAAAGATGATGTGCTACAAAAGGCACT 
TTCTTAGAAGCAAAACGAATTGAATGGTCTTCCTATGCAACTT^ 
ATATTCATAATTATTAG 

Preferred GAS 533 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (eg. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 37; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 37, wherein n is 7 or more (eg. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200 or more). These GAS 533 proteins include variants (eg.
allelic variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 37. Preferred fragments
of (b) comprise an epitope from SEQ ID NO: 37. Other preferred fragments lack one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more
amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID
NO: 37. Other fragments omit one or more domains of the protein (eg. omission of a signal peptide,
of a cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).

(20) GAS 527

GAS 527 corresponds to M1 GenBank accession numbers GI: 13622332, GI: 15675169, and
GI:242U764, to M3 GenBank accession number GI: 21910381, to M18 GenBank accession number
GI: 19746136, and is also referred to as 'Spy1204' (M1), 'SpyM3_0845' (M3), 'SpyM18_1155'
(M18) and 'guaA'. GAS 527 has also been identified as a putative GMP synthetase (glutamate
hydrolyzing) (glutamate amidotransferase). Amino acid and polynucleotide sequences of GAS 527 of
an M1 strain are set forth below:

SEQ ID NO: 39

MTBISILNDVQKIIVLDYGSQYNQLXAI&IREF^ 
AFGIDPEIFEIX5IPIIX3ICYGMQLITOKLGGKW 

40 HGDAVTB I PEGFHLVGDSNDC PYAAI ENTBKNLYGI Q FH PEVRHS VYGNDI LKNPAI S I CGARGDWSMDN 
F I DME IAKI RBTVGDRKVLLGLSGGVDS S WGVLIjQKAI GDQLTC I FVDHGLLRKDEGDQVMGMLGGKFG 
LOT IRVDASKRPLDLLADVEDPBKKRKI IGNEFVYVFDDEASKLKGVDFLAQGTLYTDI I BSGTETAQTI 
KSHHNVGGLPEDMQFELI B PLNTLFKDEVRALG I ALGMPEE I VWRQPFPGPGLAIRVMGAITEBKLETVR 
E S DA I LREB I AKAGLDRDVWQ YFTVNTGVRS VGVMGDGRTYD YT I AI RAI TS I DGMTADFAQL PWDVLKK 

45 I STRI VNBVDHVNRI VYDI TS KP PATVEWB 


SEQ ID NO: 40 

ATGACTGAAATTTCAATTTT6AATGATGTTCAAAAAATTATCGTTCTTGATTATGGTAGCCAGTACAATC 
AGCTTATTGCTAGACGTATTCGAGAGTTTGGTGTTTTCTCCGAACTA 

AGAACTTCGTGAGATCAATCCCATAGGTATCGT^ 
GCCTTTGGCATTC^CCCTGAAATCTTK^ 

TAATCACCCATAAATTAGGTGGTAAAGTTGTTCCTGCTGGACAAGCTGGTAAT 

AACCCTTCATCTTCGTGAAACGTCAAAArtATTTTCAGG 
5 CATGGTGATGCTGTTACTGAAATTCCAGAAGGTTTCCACCTTC 

CAGCTATTGAAAATACTGAGAAAAACCTTTACGGTATTCAGTTCCA 

TGGAAATGACATTCTTAAAAACTTTGCTATATCAATTT^ 

TTTATTGACATGGAAATTGCTAAAATTCGTGAAACTGT^ 

GTGGAGTTGATTCTTCAGTTGTTGGTGTTCTACTTCAAAA^ 
10 CGTTGATCACGOTCTTCTTCGTAAAGAC^ 

CTAAATATTATCCGTGTQGATGCTTCAAAACG T TrCT T AGACCTTC 

AAAAACGTAAAATTATTCGTAATGAATTTGTCTATGT^ 

TGACITCCTTGCCCAAGGAACACTTTATACTGA 

AAATCACATCACAATGTGGGTGGTCTCCCCGAAGACATGCAOT^ 
1 5 TTTTCAAAC^TGAAGTTCGAGCGCTTGGAAT 

ATTTCCAGGTCCTGGACTTGCTATC CGTGTCATGGGAGCAATTACTGAAGAAAAACTTGAAACCGTTCGC 

GAATCAGACGCTATCCTTCGTGAAGAAATTGCTAAGGCTGGA 

CAGTTAACACAGGTGTCCGTTCTGTAGGCGTCATGGGAGATGGTCOT 

TCGTGCTATTACGTCTATTGATGGTATGACAGCTGAC^ 
20 ATCTCAACACGTATCGTAAATGAAGTTGACX^CGTTAACCGTATCGTCT 

CCGCAACAGTTGAATGGGAATAA 

Preferred GAS 527 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (eg. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 39; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 39, wherein n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200 or more). These GAS 527 proteins include variants (eg.
allelic variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 39. Preferred fragments
of (b) comprise an epitope from SEQ ID NO: 39. Other preferred fragments lack one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more
amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID
NO: 39. Other fragments omit one or more domains of the protein (eg. omission of a signal peptide,
of a cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).
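
Each antigen is given as a paired amino acid and polynucleotide sequence. As a rough illustration of how such a pair relates, the sketch below translates an open reading frame with the standard genetic code. The function and variable names are hypothetical, and the input is only the first 18 nucleotides of SEQ ID NO: 40 as printed above. Note that bacterial genes may begin with alternative start codons such as TTG or GTG, which this simple table reads as leucine or valine rather than the initiating methionine.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W"   # first base T
               "LLLLPPPPHHQQRRRR"   # first base C
               "IIIMTTTTNNKKSSRR"   # first base A
               "VVVVAAAADDEEGGGG")  # first base G
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(dna: str) -> str:
    """Translate a coding sequence, stopping at the first stop codon.

    A trailing incomplete codon is ignored; any codon containing a
    character other than A, C, G or T raises ValueError.
    """
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        codon = dna[i:i + 3]
        if codon not in CODON_TABLE:
            raise ValueError(f"unrecognised codon {codon!r} at position {i}")
        residue = CODON_TABLE[codon]
        if residue == "*":  # stop codon ends translation
            break
        protein.append(residue)
    return "".join(protein)


# First six codons of SEQ ID NO: 40 as printed in this section.
n_terminal_residues = translate("ATGACTGAAATTTCAATT")
```

A strict codon check like this is also a quick way to locate positions where a transcribed nucleotide listing contains characters outside A, C, G and T.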


(21) GAS 294 

GAS 294 corresponds to M1 GenBank accession numbers GI:13622306, GI:15675145, and
GI:26006773, to M3 GenBank accession number GI: 21910357, to M18 GenBank accession number
GI: 19746111 and is also referred to as 'Spy1173' (M1), 'SpyM3_0821' (M3), 'SpyM18_1125'
(M18) and 'gid'. GAS 294 has also been identified as a putative glucose-inhibited division protein.
Amino acid and polynucleotide sequences of GAS 294 of an M1 strain are set forth below:

SEQ ID NO: 41

MSQSTATYIITVIGAGLAGSEAAYQIAKRGIPVKLYEMRGVKATPQHKTTN 

LLKEEMRRLDS I IMRNGBANRVPAGGAMAVDREGYAE SVTAELENH PL I EV I RGE I TE I PDDAXTVIATG 
PLTSDALAE KI HALNGGDGFYF YDAAAP 1 1 DKST I DMSKVYLKSRYDKGEAAYLNC PMTKBEFMAFHRAL 
TTAEEAPLNAFEKEKYFEGCMP I EVhlAKRGI KTl^YG PMKPVGLE Y PDD YTGPRDGB FKT PYAVVQLRQD 
45 NAAGSLYNI VGFQTHLKWGEQKRVFQMI PGLENABFVRYGVMHRNSYMDSPNLLTBTFQSRSNPNIiFFAG 
QMTGVEGYVBSAASGLVAGINAARLFKREEALI FPQTTAIGSLPHYVTHADSKHFQPMNVNFGI I KELEG 
PRIRDKKKRYEAIASRALADLDTCLASL 

SEQ ID NO: 42 

50 TTGTCTCAATCAACTGCAACTTATATTAATGTTATTGGAGCTGTC 

AGATTGCTAAGCGCGGTATCCCCOT 
AACCACTAATTTTGCCGAATTGGTCTGTTCCAACTCATTTW 
CTTCTCAAAGAAGAAATGCGGCGATTAGACTCCATTATTATGOT 
CTGOGGGAGCAATGGCTGTTGACCGTGAGGGGTATGC^ 
5 TCTCATTGAGGTC A TTC GT GGTGAAATTACAGAAATCCCIt^ 
CCGCTGACTTCGGATGCCCTGGCAGAAAAAATTCAC^ 

ATGCAGCAGCGCCTATCATTGATAAATCTACCATTGATATGAGCAAGGTTTACC^ 
TAAAGGCGAAGCTGCTTACCTCAACTGCCCTATGACC 
ACAACCGCAQAAGAAGCCCCGCTGAATGCCTTTGAAAAAGAAAAGT 
10 AAGTTATG6CTAAACGTG6CATTAAAACCATGCTTTATGGACCTA 

AGATGACTATACAGGTCCTCXXX^TGGAGAATTTAAAACGCCATATGCCGTC 

AATGCAGCTGGAAGCCTTTATAATATCGTTGOTTTCCAAA 

TTTTCCAAATGATTCCAGGGCTTGAAAATGCTGAGTTTGTC 

TATGGATTCACCAAATCTTTTAACCGAAACCTTCCAATCTC^ ■ 
1 5 ' CAGATQACTGGAGTTGAAGGTTATGTCGAATCAGCTGCTTCA 
GTTTGTTCAAAAGAGAAGAAGCACTTATTTTTCCTCA^ 
GACTCATGCCGACAGTAAGCATTTCCAACCAATGAACGTCA^ 
CCACGCATTCGTGACAAAAAAGAACGTTATGAAGCTATTGCTAGTCGTC 
GCTTAGCGTCGCTTTAA ' 

Preferred GAS 294 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (eg. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 41; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 41, wherein n is 7 or more (eg. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200 or more). These GAS 294 proteins include variants (eg.
allelic variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 41. Preferred fragments
of (b) comprise an epitope from SEQ ID NO: 41. Other preferred fragments lack one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more
amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID
NO: 41. Other fragments omit one or more domains of the protein (eg. omission of a signal peptide,
of a cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).
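
The fragment definitions in (b) and the terminal truncations can be sketched as simple string operations. This is only an illustration under stated assumptions: the function names are hypothetical, the ten-residue peptide is a toy input rather than any SEQ ID NO, and the sketch says nothing about whether a given fragment contains an epitope.

```python
def fragments_of_length(sequence: str, n: int = 7) -> list[str]:
    """All fragments of exactly n consecutive residues (n is 7 or more in the text)."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return [sequence[i:i + n] for i in range(len(sequence) - n + 1)]


def truncate_termini(sequence: str, n_term: int = 0, c_term: int = 0) -> str:
    """Remove n_term residues from the N-terminus and c_term from the C-terminus."""
    if n_term < 0 or c_term < 0:
        raise ValueError("residue counts must be non-negative")
    if n_term + c_term >= len(sequence):
        raise ValueError("truncation would remove the whole sequence")
    return sequence[n_term:len(sequence) - c_term]


toy_peptide = "ACDEFGHIKL"  # hypothetical 10-residue peptide
seven_mers = fragments_of_length(toy_peptide, 7)
trimmed = truncate_termini(toy_peptide, n_term=1, c_term=2)
```

A sequence of length L yields L - n + 1 fragments of length n, so the number of candidate fragments falls as n rises.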



(22) GAS 253 

GAS 253 corresponds to M1 GenBank accession numbers GI: 13622611, GI: 15675423, and
GI:21362716, to M3 GenBank accession number GI: 21910711, to M18 GenBank accession number
GI:1974(W73 and is also referred to as 'Spy1524' (M1), 'SpyM3_1175' (M3), 'SpyM18_1541'
(M18) and 'murG'. GAS 253 has also been identified as a putative
undecaprenyl-PP-MurNAc-pentapeptide-UDPGlcNAc GlcNAc transferase. Amino acid and
polynucleotide sequences of GAS 253 of an M1 strain are set forth below:

SEQ ID NO: 43 

40 MPXKI LFTGGGTVGHVTLNLI LI PKF I KDGWEVHY IGDKNGI BHTB I E KSGLDVT PHAI ATGKLRR Y FSW 

QNLADV FKVALGLLQS LF I VAKLR PQALFS KGG FVSV PPWAAKLLGKPVF I HESDRS MGLANKI AYKF A 

TTMYTTFEQEDQLSKVKHLGAVTKWKDANQMPESTO 
HPELKQRYNIINTTGDPHLNBLSSHLYRVDYVTDLY 

GKEASRGDQLENATYFEKRGYAKQ1X)BPDL^ 

45 ADISSAIKEK 


SEQ ID NO: 44 

ATGCCTAAGAAGATTTTATTTACAGGTGGTGGAACTGTA 
CAAAATTTATCAAGGACGGTTGGGAAGTACATTATATTGGTGATAAAAATGGCA 

TQAAAAGTCAQGCCTTGACGTt^CCTTTCATGCTATOGCG^ 

CAAAATCTAGCTCATCTTTTTAAQGTTGCACTTGGCCTCCTA 

GCCCTCAAGCC C TTTT TTC CAAAGCnX^TTTT GT C TC ACT 

TAAACCAGTCTTTATTCATGAATCAQATCGGTCAATGGGA 
5 ACTACCATCTATACCACTTTO 

AG GTTTTC AAAGATGCCAACCAAATGCCTGAATCAACTCAGTTAGAG 

AGACCTAAAAACCCTCTTGTTTATTGGTGGTTCGGCAGGGGCGCATG^ 

CATCCAGAATTGAAGCAACGTTATAATATCATCAATATTACAGQAQACCCTCA^ 

CTCATCTCTATCGAGTAGATTATK3TTACCGATCT 
10 GACAAGAGGGGGCTCTAATACACTTTTTGAGCTACTGGCAATGGCT 

GGTAAAGAAGCTAGCCXSTGGCGATCAGTTAGAAAATt^CAC^ 

AATTACAGGAACCTGATTTAACTTTGCATAATTTTG^ 

TGATTATGAGGCTACTATGTTGGCAACTAAGGAGATrCAGTCACCGGACTTC^ 
GCTGATATTAGCTCCGCGATTAAGGAGAAGTAA 

Preferred GAS 253 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (eg. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 43; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 43, wherein n is 7 or more (eg. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200 or more). These GAS 253 proteins include variants (eg.
allelic variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 43. Preferred fragments
of (b) comprise an epitope from SEQ ID NO: 43. Other preferred fragments lack one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more
amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID
NO: 43. Other fragments omit one or more domains of the protein (eg. omission of a signal peptide,
of a cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).



(23) GAS 529 

GAS 529 corresponds to M1 GenBank accession numbers GI:13622403, GI:15675233, and
GI:21759132, to M3 GenBank accession number GI: 21910446, to M18 GenBank accession number
GI: 19746203 and is also referred to as 'Spy1280' (M1), 'SpyM3J»l0' (M3), 'SpyM18_1228'
(M18) and 'glmS'. GAS 529 has also been identified as a putative
L-glutamine-D-fructose-6-phosphate aminotransferase (Glucosamine-6-phosphate synthase). Amino
acid and polynucleotide sequences of GAS 529 of an M1 strain are set forth below:


SEQ ID NO: 45 

35 MCG I VGWGNRNATDI LMQGLE KLE YRGYDS AG I FVANANQTNM KSVGRI ADLRAKI GI DVAGSTGI GH 
' TRWATHGQSTEDNAHPHTSQTGRFVLVHNGVIENYIjHIKTEPLAGHDFKGQTDTE 
S VLEAFKKS LS 1 1 BGS YAF A1J4DSQATDT I YVAK^ 

ELVI LTKDKVTVTDYDGKELIRDS YTAELDLSDI GKGTYPFYMLKBIDEQPTVMRQLI STYADBTGNVQV 
DPAIITSIQEADRLYILAAGTSYHAGFATKNMLBQLTOT 
40 TADSRQVLVKANAMGI PSLTVTNVPGSTLSREATYTMLI HAG PE I AVASTKAYTAQI AA1AFLAKAVGEA 
NGKQEALDFNLVHBLS liVAQS I EATLSEKDLVABKVQALLATTRNAFYIGRGNDYYVAMBAALKLKEI SY 
IQCEGFAAGELKHGTISLIEEDTPVIALISSSQLVASHTRGKIQBVAARGAHVLTVVEEGLDREGDDIIV 
NKVHPFLAPIAMVI PTQLI AYYAS LQRGLDVDKPRNLAKAVTVE 

SEQ ID NO: 46

ATGTGTGGAATTGTTGGAGTTGTTGGAAATCGCA^ 
TTGAATACCGGGGTTATGATTCAGCAGGAATTTTTC 
AGTGGGGCGGATTGCTGATTTGCGTGCCAAGATTGGCA 
ACCCGTTGGGCAACGCATGGCCAATCAACAGAGGATAATGCCC^T 

TTGTACTTGTTCATAATGGTGTGATTGAAAATTACCT 

TTTTAAGGGGCAGA<^GATACTGAC^TrGCACT^ 

TCAGTACTGGAAGCTTTTAAAAMTCTTTAAGCATT^ 

GCCAAGCAACTGATACTATTTATGTGGCTAAAAACAAGTCTCCATTC 
5 CAACATGGTTTCTTCAGATGCCATGGCCATGATTCGTGAA^ 

GAGCTAGTTATTTTAACCAAAGATAAGGTAACTGTTACAOACTACGATGGTAAAQA 

CCTACACTGCTGAATTAGACTTATCTGATATTGGCAAAGGGACT^ 

TGATGAGCAACCAACCGTAATGCGTCAATTAATTTCAACTTATC 

GATCCGGCTATCATTACCTCTATCCAAGAGGCTGACCGTCTT^ 
10 ATGCIWTTTTCCAACAAAAAATATGCTTG^ 

TOAGTGGGGTTACCACATGCCTCTGCTTAGCAAGAAACCAATGTTTACT 

ACCXKIAGATAGTCQTCAAQTTTrAGTAAAGGCAAATGCTATGGGCATTC CGAGTTTGACAGTAACTAACG 
TTCCAGGATCAACCTTATCACGTGAAGCAACATACACCATGTTGACT 
TCCGTCTACAAAAGCTTACACTGCACAAATTGCTG^ 
15 AATGCTAAGCAAGAAGCTCTTGACTTTAACTTGCTACATQAGTT^ 
CGACTTTGTCTGAAAAAGATCTCX3TGGCAGAAAAGGTTC^ 
TTACATCGGGCGTGGCAATGATTATTACGTTGCGATGGAAGCTC 
ATTCAATGCGAAGGCTTTGCGGCTQGTGAATTGAAACATGGAAC 
CAGTAATCCkrrTTAATATCGTCTAGTCAGTTGGTTGCCT 

20 tgcccgtggggctcatgttttaac^gttg^ 

AATAAGGTTCATCCTTTCCTAGCCCCGATTGCTATGCT« 
CATTACAACGTGGACTTGATGTTCATAAGCCACGTAATTT^ 

Preferred GAS 529 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (eg. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 45; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 45, wherein n is 7 or more (eg. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200 or more). These GAS 529 proteins include variants (eg.
allelic variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 45. Preferred fragments
of (b) comprise an epitope from SEQ ID NO: 45. Other preferred fragments lack one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more
amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID
NO: 45. Other fragments omit one or more domains of the protein (eg. omission of a signal peptide,
of a cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).



(24) GAS 045

GAS 045 corresponds to M3 GenBank accession number GI: 21909751, M18 GenBank accession
number GI: 19745421 and is referred to as 'SpyM3_0215' (M3), 'SpyM18_oppA' (M18) and 'oppA'.
GAS 045 has been identified as an oligopeptide permease. Amino acid and polynucleotide sequences
of GAS 045 from an M1 strain are set forth below:

SEQ ID NO: 47

VTFMKKSKWLAAVSVAI LSVS AIAA CGNKNASGGSBATKTYKYVPVNDPKSLDYI LTNGG 
GTTDVI TQMVDGLLENDBYGNLVPSLAKDW 

TABDFVTGLKHAVDDKSDALYWEDSI KNLKAYQNGBVDFKBVGVKAI^DKTVQYTLKKP 
BSYVWSKTTYSVLPPVNAKFLKSKGKDFGTTDPSSILVNGAYFLSAFTSKSSMEPHKNBN 

45 YWDAKNVGI ESVKLTYSDGSDPGSFYKNFDKGEFSVARLYPNDPTYKSAKKNYADNITYG 
MLTGDI RHLTWNLNRTS F KNTKKDPAQQDAGKXALNN KD FRQAI Q FAFDRAS FQAQTAGQ 
DAKTKAUlNmiVPPTFVTIGESDFGSBVEKEMAKLm • 
FAKAKEALTAEGVTFPVQLDYPVDQANAATVQBAQSFKQSVEASLGKBNVIV^ 
THBAQGFYAETPBQQDYDI I SS WWGPDYQDPRTYLDIMS PVGGGS VIQKLGI KAGQNKDV 

50 VAAAGLDTYQTLLDEAAAITDDNDARYKAYAKAQAYLTDKAVDI PWALGGT PRVTKAVP 
PSGGFSWAGSKGPLAYKGMKLQDKPVTVKQYEKAKEKV1W 
SEQ ID NO: 48

GTGACTTTTATGAAGAAAACTAAATGGTTGGCAGCTGTAAg 
* TCCGCTTTQGCAGCTT GTGGTAATAAAAATGCTTCAGGTG 
5 TACAACTA CGTTTT T CT T AACQATCCAAAATCATTGGATT AT ATTTTQACTAATGGCOOT 

GGAACGACTGATGTGATAACACAAATOGTTGATGGTCTTTT^ 

AATTTAGTACC^TCACTTGCTAAAGATTGGAAGGTTTCA^ 

TATACTCTTCGCGATGGTGTCTCTTGGTA 

ACAGCAGMOATTTTGTGACTGGTTTCAAGCACGCOGTTC^ 
10 TACGTTGTTGAAGATTCAATAAAAAACTTAAAGGCTTACCAAAATOGTQAAGT^ . 

AAAGAAGTTG(?lXn , CAAAGCCCTTGACGATAAAACTC 

GAAAGCTACTGGAATTCAAAAACAACTTATAGTGTGCTTTTCCCAGCT 

TTGAAGTCAAAAGGTAAAGATTTTGGTACAACCGATCCATCATCAAT^ 

GCTTACTTCTTC^CGCCTTCAC 
15 TACTGGGATGCTAAGAATGTTGGGATAGAATCTGTTAAATTGACTTA 

GACCCAGGTTCGTTCTAGAAGAACTTTGACAAGGGTGAGTTCAGC^ 

CCAAATGACCCTACCTACAAATCAGCTAAGAAAAACTATGCTGATAAC 

ATGTTGACTGGAGATATCCGTCATTTAACATGGAATTTG^ 

ACTAAGAAAGACCCTGCACMCAAGATGCCGGTAAGAAAGCTCTTAACAAC^ • 
. 20 CGTCAAGCTATTCAGTTTGCTTTTGACCGAGCGTCATTCC^ 
GA^CAAAACAAAAGCCTTACGTAAOlTqCTTGTCCa 
GAAAGTGATTTTGGTTCAGAAGTTGAAAAGGAAATGGCAAAACTTGGTC 
GACX5TTAACTTAGCTGATCCTCAAGATGGTCT 

TTTGCAAAAGCCAAAGAAGCTTTAACAGCTGAAGGTGTAACC1T * 
25 TACCCTGTTGACCAAGCAAACGCAGCAACTGTTCAGGAAGCCCACrrCTTl^ 
GTTGAAGCATCTCTTGGTAAAGAGAATGTCATTGTCAATG^ 
ACTCACGAAGCCCAAGGCTTCTATGCTGAGACCCCAGAAC^ 

TCATCATGGTGGGGACCAGACTATCAAGATCCACGGACCTACCTTGACATCATG^ . 
' GTAGGTGGTGGATCTGTTATCCAAAAACTTGGAATCAAAG^ 
30 . GTGGCAGCTGCAGGCCTTGATACCTACCAAACTCTTCTTGATC 

GACGACAACGATGCGCGCTATAAAGCTTACGCAAAAGCACAAGCCTACCTT^ 

GCCGTAGATATTCCAGTTGTGGCATTGGGTGGCACTCCACGAGTTACTAAAG^CXv 

TTTAGCGGGGGCTTCTCTTGGGCAGGGTCTAAAGGTCCTCTO 

CTTCAAGACAAACCTGTCACAGTAAAACAATACGAAAAAGCAAAAGAAA^ 
35 GCAAAGGCTAAGTCAAATGCAAAATATGCTGAGAAGTTAGCTGATCACGTTC 

Preferred GAS 045 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (eg. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 47; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 47, wherein n is 7 or more (eg. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200 or more). These GAS 045 proteins include variants (eg.
allelic variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 47. Preferred fragments
of (b) comprise an epitope from SEQ ID NO: 47. Other preferred fragments lack one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more
amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID
NO: 47. For example, in one embodiment, the underlined amino acid sequence at the N-terminus of
SEQ ID NO: 47 is removed. Other fragments omit one or more domains of the protein (eg. omission
of a signal peptide, of a cytoplasmic domain, of a transmembrane domain, or of an extracellular
domain).

(25) GAS 095

GAS 095 corresponds to M1 GenBank accession numbers GI: 13622787 and GI: 15675582, to M3
GenBank accession number GI: 21911042, to M18 GenBank accession number GI: 19746634 and is
also referred to as 'Spy1733' (M1), 'SpyM3_1506' (M3), 'SpyM18_1741' (M18). GAS 095 has also
been identified as a putative transcription regulator. Amino acid and polynucleotide sequences of
GAS 095 of an M1 strain are set forth below:

SEQ ID NO: 49 

5 MKIGKK I VUjPTA I VLTTVLALGVYLTS AYTPST GBLSKTFKDPSTSSNKSDA I KQTRAFS I LIUGVDTG 
SSBRASKWBGNSDSMILVTVNPKTKKTTMTSLBRDTLTTl^GPKNNBMNG 

VQDLLN I TI DNYVQINMQGLI DLVNAVGGITVTNEFDFPISI ABNBPBYQATVAPGTHKINGBQA1#VYAR 
MRYDDPEGDYGRQKRQRBVIQKVLKKI tALDSI SSYRKI LSAVSSNMQTNI BI SSRTI PSLIX3YRDALRT 
IKTYQLKGBDATLSDGGSYQIVTSNHUiEIQNRIRTBl/3UiKVNQLKTNATVYBN 
10 SSGQAPSYSDSHSSYAOTSSGVDTGQSASTDQDS^ 
NPQT 


SEQ ID NO: 50

ATGAAAATTGGAAAAAAAATAGTTTTAATGTTC^^ 
15 TCTATCTAACTAGTGCTTATACCTTCTCAA CAGGAGAATTATCAM 

TTCAAACAAAAGTGATGCCATTAAACAAACAAGAGCTTITrCTA 
TCTTCAGAGCGTGCCTCCAAGTGGGAAGGAAACAGTGATTCGAT^ 
CCAAGAAAACAACTATGACTAGTTTAGAACGAGATACCTTAACCACGTTATCTGGACC 
AATGAATGGTGTTGAAGCTAAGCTTAACGCTGCTrATC 
20 GTGCAAGAT CT T T TGAATATCACCATTQATAACTATGTTCAAATTAATATGCAA 
TGAATGCAGTTGGAGGGATTACAGTTACAAATGAGTTT^^ 
TGAATATCAAGCTACTGTTGCGCCTGGAAC^CACAAAATO 

ATGOTTTATGATGATCCTGAGC^ 

TGAAAAAAATCCTTGCTCTTGATAGCATTAGCTC1TATCGGAAGATTW 
25 GCAAACGAATATCGAAATCTCTTCTCGCACTATCCCTAGTCTATTAGGTTAT^ 

ATTAAGACTTATCAACTAAAAGGAGAAGATGCCACTTTATCAGATGGTGGAT^ 

CTAATCATTTGTTAGAAATCCAAAATCGTATCCGAACAGAATTAGGACIT 

AACAAATGCTACTGTTTATGAAAATTTGTATGGGTCAA 

TCTTCAGGCCAGGCTCCATOTATTCTGATAGT^ 
30 CCGGCCAGAGTGCTAGTACAGACCAGGACTCTACTGCTT^ 

ATCAGATGCTTTAGCAGCTGATGAGTCTAGCTCATCAGGGTC 

AACCCTCAGACCTAA 

Preferred GAS 095 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (e.g. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 49; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 49, wherein n is 7 or more (eg. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200 or more). These GAS 095 proteins include variants (eg.
allelic variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 49. Preferred fragments
of (b) comprise an epitope from SEQ ID NO: 49. Other preferred fragments lack one or more amino
acids (eg. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more
amino acids (eg. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID
NO: 49. For example, in one embodiment, the underlined amino acid sequence at the N-terminus of
SEQ ID NO: 49 is removed. Other fragments omit one or more domains of the protein (eg. omission
of a signal peptide, of a cytoplasmic domain, of a transmembrane domain, or of an extracellular
domain).



(26) GAS 193 

GAS 193 corresponds to M1 GenBank accession numbers GI: 13623029 and GI:15675802, to M3
GenBank accession number GI: 21911267, to M18 GenBank accession number GI: 19746914 and is
also referred to as 'Spy2025' (M1), 'SpyM3_1731' (M3), 'SpyMISJtfte' (M18) and 'isp'. GAS 193
has also been identified as an immunogenic secreted protein precursor. Amino acid and
polynucleotide sequences of GAS 193 of an M1 strain are set forth below:

SEQ ID NO: 51

5 MKKRKLIJVVnXSTILLNSAVPLW 

KDHKPSsfTHPTPPSNDTKQTDQASSBATDKPNKDKNDTKQPDS 

PDQQKDQTPDKTPBKSADKTPBKGPEKATDKT^^ 

RSSAAYVRHMTGDSAYTHNLLSWtYGITABQL^ 

AMABSSLCTQGVAKBKGAhWFGYGAFTDFNPNNAKKYSDBVAI RHMVBDTI IANKNQTPERQDLKAKKWSL 

10 <^U>TLIDGGVYFTDTSGSGQRRADIM^ 

LTYKS BT Y S FGQCTWYAYNR VKB LG YQVDR YMGNGGDWQRKPGFVTTHKPKVG YVVS PAPGQAGAD ATYG 
HVAWEQI KEDGSILI SESNVMGLGTI SYRTPTABQASLLTYWGDKLPRP 

SEQ ID NO: 52

15 ATGAAGAA^GGAAATTGTTAGCAGTAACACTATTAAGTACCATACTCTTAAAC^GTGCAG 
TTGTTGCTGATACCTCCTTGCGTAATAGCACATCATCC^ 

GGATGACGAGAGTGAAACACCAAAAAAAGACAAAAAAAGCAAGGAAACAGCGTCGC^ 
AAAGACCATAAGCCATCACACACTCACCCAACCCCCCCTTCAAATGATACTAAGCAGACCGAT^ \ 
. CATCTGAAGCTACTGACAAACCAAATAAAGACAAAAACGACACCA 

20 .CACCCCATCTCCCAAAGACttGTCGTCTCAAA^ 

CCTGATCAGCAAAAAGATCAGACACCTGATAAAACACC AGAAAAATCAGCTGATAAAAC CCCTGAAAAAG 

GACCAGAAAAAGCAACTGATAAAA^ 

AGCAGCTGCTCCTGTCTTTATACCTTGGAGAGAAAGTGACAAAGACCT^ 
CGCTCATCAGCGGCTTACGTGAGACACTCGACAGGTGACTCTCCCT 
25 GTTATGGGATTACTGCTGAACAGCTAGATGGTTTTTTGAA^ 
CTTAAACGGAAAGCGTTTATTAG^ 

GCAATGGCAGAAAGCTCACTAGGTACTCAGGGACrrTGCTAAAG 

GCGCCTTTGACTTCAACCCAAACAATGCCAAAAAATACAGCGATGAGG 

AGACACCATC^TO^C^CAAAAACCAAACCTTTGAAAGACAA 
30 GGCCAGTTGGATACCTTGATTGATGGTGGGGTTTACTT^ 

CAGATATCATGACCAAACTAGACCAATGGATAGATGATCATGGAAGCACACC^ 

CAAGATAACTTCCGGGACAC^TTTAGCGAAGTGCCCGTAGGTTATAAAAG^ 

TTGACCTACAAGTCAGAGACCTACAGCTTTGGCCyUlTGCACTTGGTACGC 

TAGGTTATCAAGTCGACAGGTACATGGGTAACGGTGGCGA 
35 CCATAAACCTAAAGTGGGCTATGTCGTCTCATTTGCACCAG^ 

CACGTTGCTGTTGTAGAGCAAATCAAAGAAGATTC 

TAGGCACC^TTTCCTATCGGACGTTCACAGCTGAGCAGGCT 

ACTCCCAAGACCATAA 

Preferred GAS 193 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (eg. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 51; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 51, wherein n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200 or more). These GAS 193 proteins include variants (e.g.
allelic variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 51. Preferred fragments
of (b) comprise an epitope from SEQ ID NO: 51. Other preferred fragments lack one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more
amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID
NO: 51. Other fragments omit one or more domains of the protein (eg. omission of a signal peptide,
of a cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).
(27) GAS 137

GAS 137 corresponds to M1 GenBank accession numbers GI:13621842, GI:15674720 and
GI:30173478, to M3 GenBank accession number GI:21909998, to M18 GenBank accession number
GI: 19745749 and is also referred to as 'Spy0652' (M1), 'SpyM3_0462', and 'SpyM18_0713' (M18).
Amino acid and polynucleotide sequences of GAS 137 of an M1 strain are set forth below:


SEQ ID NO: 53

MSDKHINLVIVTGMSGAGKTVAIQSFBD1XSYCT 

KBINSTLDS I ESN PS I DPRI LP LDATDGE LVSR YKBTRRSH PLAADGRVLDGI RLBRELLS PLKSMSQHV 
VWTKLTPRQLRKTISDQFSEGSNQASFRIEVMSFG^ 
1 0 BDVFNYVMS H PBS EVFYKHLLNLI V P I LPAYQKEGKS VLTVAIGCTGGQHRS VAFAHCLABSLATDW S VN 
BSHRDQNRRKBTVNRS 

SEQ ID NO: 54

ATGTCAGACAAACACATTAATTTAGTTATTGTGACAGGAATO 
1 5 AGTCTTTTGAGGATCTAGGCTACTTTACCATTGATAA^ 

ATTAATTGAACAAACCAATGAAAATCGTAGGGTGGCTTTGGTTGT 

AAGGAAATTAATTCTACCTTAGATAGTATTGAAAGCAATCCT 

ATGCAACGGATGGAGAATTGGTGTCACGCTATAAAGAAACCAGACGGAGC 

TCGTGTGCTTGATGGTATTCGATTQGAAAGAGAACTC 
20 GTGGATACAACAAAATTGACCCCTAGACAATTGCGTAAAACCATT^ 

ATCAAGCCTCTTTCCGTATTGAAGTGA'TGAGCTTTGC^ 

GGTTTTTGATGTGOGTTTTCrACCCAATCCW 

GAGGACGTTTTTAATTATGTGATGTCTCACCCAGAATCAGAGGTC 

TTGTCCCTATCTTACCXSGCTTACCAAAAAGAAGGGAAGTCT^ 
25 AGGCCAACACCGCAGOGTTGC CT T TC CCCATTGCrTGGCAG 

GAAAGCCATCGTGATCAAAATCCTCGTAAGGAAACGGTGAATCGTTC 

Preferred GAS 137 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (eg. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 53; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 53, wherein n is 7 or more (eg. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200 or more). These GAS 137 proteins include variants (eg.
allelic variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 53. Preferred fragments
of (b) comprise an epitope from SEQ ID NO: 53. Other preferred fragments lack one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more
amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID
NO: 53. Other fragments omit one or more domains of the protein (eg. omission of a signal peptide,
of a cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).


(28) GAS 084 

GAS 084 corresponds to M1 GenBank accession numbers GI: 13622398 and GI: 15675229, to M3
GenBank accession number GI: 21910442, to M18 GenBank accession number GI: 19746199 and is
also referred to as 'Spy1274' (M1), 'SpyM3_0906' and 'SpyM18_1223' (M18). GAS 084 has also
been identified as a putative amino acid ABC transporter/periplasmic amino acid binding protein.
Amino acid and polynucleotide sequences of GAS 084 of an M1 strain are set forth below:

SEQ ID NO: 55

M IIKKRTVAIlAIASSFPLVA OQATKSLKSGDAWGVYOKQKSITVGPDNTPyPMGYKDBSGRCKGPDIDL 
AKBVraQYGLKVNFQAINWDMKEABI^OKIDVIWNGYSITKERQ

TI SDMKHKVLGAQSASSGYDSLLRTPKLLKDP I KNKDANQYBTFTQAFI DLKSDRI DGI LI DKVYANY YL 
AKBGQLBN YRMI PTTPBMBAPSVGLRKEDKTLQAXI NRAPRVLYQNGKFQAI SBKWPGDDVATANI KS 

SEQ ID NO: 56

ATWTTATAAAAAAAAGAACCGTAGCAATTTTAGCCATAGCTAGTAGCTlTrrC 

CTACTAAAAGTCTTAAATCAGGAGATGCTTGGGGAGTTTACC 

TGAC^TACGTTTGTTCCTATGGGCTATAAGGATGAAA 

GCTAAAGAAGTTTTTCACCAATATGGACTCAAGGTTAACTTTCAAGCT 
1 0 C^GAACTAAACAATGGTAAAATTGATGTAATCTGGAATGGTTATT^ 

GGTTGCCTTTACTGATTCTTACATGAGAAATGAACAAATTATTC 

ACAATATCAGATATGAAACATAAAGTGTTAGGAGCACAATCAGC^ 

G^CTCCTAAACTGCTGAAAGATTTTATTAAAAATAAAGACGCTAATCAATATG 

TTTTATTGATTTAAAATCAGATCGTATCGATGQAATATTGATTGACAAACT 
15 GCAAAAGAAGGGCAATTAGAGAATTATCGGATGATCCCAACGACCTTTGAAM 

GACTTAGAAAAGAAGACAAAACGTTGCAAGCAAAAATTAATCGTGCTTTCAGGGTC 

CAAATTTCAAGCTATTTCTGAGAAATGGTTTGGAGATGATGTTC 

Preferred GAS 084 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (eg. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 55; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 55, wherein n is 7 or more (eg. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200 or more). These GAS 084 proteins include variants (eg.
allelic variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 55. Preferred fragments
of (b) comprise an epitope from SEQ ID NO: 55. Other preferred fragments lack one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more
amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID
NO: 55. For example, in one embodiment, the underlined amino acid sequence at the N-terminus of
SEQ ID NO: 55 is removed. Other fragments omit one or more domains of the protein (eg. omission of
a signal peptide, of a cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).


(29) GAS 384 

GAS 384 corresponds to M1 GenBank accession numbers GI: 13622908 and GI: 15675693, to M3
GenBank accession number GI: 21911154, to M18 GenBank accession number GI: 19746801 and is
also referred to as 'Spy1874' (M1), 'SpyM3_1618' (M3), and 'SpyMtt.m^' (M18). GAS 384 has
also been identified as a putative glycoprotein endopeptidase. Amino acid and polynucleotide
sequences of GAS 384 of an M1 strain are set forth below:

SEQ ID NO: 57 

MKTIAFDTSNKTLSLAILDDETLLAD 

GLRVAVATAKTLAYSLNI ALVGI SSLYALAASTCKQY PNTLWPJjI DARRQNAYVGYYRQGKSVMPQAHA 
40 S LEVI I EQLVBEGQLI FVGBTAPFABKIQKKLPQAI LLPTLPSAYBCGLLGQSLAPENVDAFVPQYLKRV 
SAEENWLKDNB I KDDS HYVKRI 

SEQ ID NO: 58 

ATGAAGACACTTGCATTTGATACCTCAAATAAAACCTTG^ 
45 TAGCAGATATGACCCTTAACATTCAGAAAAAACATAGTGTTAK 
GACTTGTACKSATCTTAAACCTCAAGAm 
GGTTTACGAGTGGCAGTTGCTACTGCAAAAACGTTAGCGTAC^ 
(X»GTCTATATGCTTTGGCTGCGTCT^ 

TGCTAGAAGGCAAAATGCGTATGTAGGTTATTATCGGCAAGGAAAATCAGTGA 

TCACTAGAAGTTATTATAGAACAATTAGTAGAAGA^
TTGCTGAGAAAATTCAAAACAAACTACCTCAGCOQATACTACT^ 
TGCTCTTTTGGOGCAAAGTTTGGCACCAGAAAATGTAGACGCCT^ 
GAAGCTGAAGAAAACTGGCTCAAAGATAATGAGATAAAAGATGATAGT^ 

Preferred GAS 384 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (eg. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 57; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 57, wherein n is 7 or more (eg. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200 or more). These GAS 384 proteins include variants (eg.
allelic variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 57. Preferred fragments
of (b) comprise an epitope from SEQ ID NO: 57. Other preferred fragments lack one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more
amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID
NO: 57. Other fragments omit one or more domains of the protein (eg. omission of a signal peptide,
of a cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).


(30) GAS 202 

GAS 202 corresponds to M1 GenBank accession numbers GI: 13622431 and GI: 15675258, to M3
GenBank accession number GI: 21910527, to M18 GenBank accession number GI: 19746290 and is
also referred to as 'Spy1309' (M1), 'SvyW JMY (M3), 'SpyM18_1321' (M18) and 'dltD'. GAS
202 has also been identified as a putative extramembranal protein. Amino acid and polynucleotide
sequences of GAS 202 of an M1 strain are set forth below:

SEQ ID NO: 59

MLKRLWLILGPLLIAFVLWITIFSFPTQ^ 
25 GSSEWSRMDSMHPSVLABKYKRSYRPFLIGKRGSASI£ 

PSAVQMYLSNTQVI EFLLKARTDKBSQFAAKRLLELNPGVS KSNLLKKVSKGKSJjSRLDRA I LKCQHQVA 

LRBESLFSFI/3KSTNYBKRILPRVKGLPKVFSYTO 
YKNFQVNYSYLASPEYNDFQI^LSEFAKRKTD^ 
. FHRIADFSKDGGESYFMQDTIHLGWNGWIJ^ 

SEQ ID NO: 60

ATGCTTAAGAGACTCTGGTTAATTCTAGGTCCT 
TTAGTITTCCTACACAACITGATCATTC 

TTCTTTTAAAAATGGTTTGATTAAAAGACAAG 
35 GGTTCTAGCGAATGGAGTCGAATtK^TAGTATGCACCCTTCGGTGCTTGCAGA 

ATAGACCATTTTTAATTGGTAAGAGAGGATCAGCATCTT^ 

CAATGAAATGCAAAAGAAAAAAGCCATCTTTGTAGTATCTCCTCAATGGTTTACT^ 

CCTACTGCGGTTCAGATGTACTTGTCTAACACTdAAGTGATT^ 

AAGMTCACAGTTTGCAGCAAAGCGTTTGCTTGA 
40 AAAAGTAAGTAAGGGTAAGTCTCTTAGTCGGTTAGACAGAG 

TTt^GAGAAGAGTCCCTTTTTAGTTTrTTAGGCAAATCTACT • 

TTAAGGGATTACCTAAAGTATTTTOT^ 

aacaAccaacaaccgttttgggattaaaaatacattttat • 
tataagaatttccaagttaattatagttacc^^ 
45 cagaatttgctaaacgaaaaacagatgtactctttgt^ 

taccx3gcttaaatcaagataagtatcaagcggcagttcgtaa 
tttcatcgcattgctgacttctcaaaagatggt^ 
gttggaatggctggttagcttttgataagaaag 
ctataaaatgaacccttatttttatactaaaatttgggcaaataggaaagactrc 

Preferred GAS 202 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (eg. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 59; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 59, wherein n is 7 or more (eg. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200 or more). These GAS 202 proteins include variants (eg.
allelic variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 59. Preferred fragments
of (b) comprise an epitope from SEQ ID NO: 59. Other preferred fragments lack one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more
amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID
NO: 59. Other fragments omit one or more domains of the protein (eg. omission of a signal peptide,
of a cytoplasmic domain, of a transmembrane domain, or of an extracellular domain).
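
The identity threshold of (a) and the fragment condition of (b) can be illustrated with a minimal sketch. It assumes that the two sequences have already been aligned to equal length (gaps written as '-') and that identity is counted as identical columns divided by total alignment columns; this text does not fix an alignment method or a denominator, so both are assumptions. The function names and the toy sequences are hypothetical and are not taken from any SEQ ID NO.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over a pre-computed pairwise alignment.

    Assumption: both strings have equal length and use '-' for gaps.
    Identity = identical non-gap columns / total alignment columns * 100.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


def is_fragment(candidate, reference, n=7):
    """True if candidate is a run of at least n consecutive residues of reference."""
    return len(candidate) >= n and candidate in reference


# Toy sequences (hypothetical).
ref = "MKTAYIAKQR"
var = "MKTAYLAKQR"  # one substitution in ten aligned positions

identity = percent_identity(ref, var)
meets_50_percent = identity >= 50.0
has_7mer = is_fragment("TAYIAKQ", ref)
```

A candidate would satisfy (a) when `meets_50_percent` is true and (b) when `is_fragment` returns true for the chosen n.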

(31) GAS 057

GAS 057 corresponds to M1 GenBank accession numbers GI: 13621655 and GI: 15674549, to M3
GenBank accession number GI: 21909834, to M18 GenBank accession number GI: 19745560 and is
also referred to as 'Sp^W (M1), 'SpyM3_0298' (M3), 'SpyM18_0464' (M18) and 'prtS'. GAS
057 has also been identified as a putative cell envelope proteinase. Amino acid and polynucleotide
sequences of GAS 057 of an M1 strain are set forth below:

SEQ ID NO: 61 

MBKKQRFSLRKYKSCTFSVLIGSVFliVMTCT 
20 TS Q I TL KTNRE KEQSQDLVS B PTTTE LADTD AAS MANTG S D ATQ KS AS L P P VNTD VHD WVKT KGA WD KG Y 
KGQGKWAVI DTGI D PAHQSMRI SDVSTAKVKSKEDM1JUIQKAAGIMYGSWINDKWPAHNYVENSDNI K 
BNQFBDFDEDWENFEFDAEAE PKAI KKHKI YRPQSTQAPKBTVI KTBETDGSHDI DWTQTDDDTKYBSHG 
MHVTGIVAGNSKEAAATGERFLGIAPEAQVWFMRVFANM^ 

NGAQLSGSKPLMEAI E KAKKAGVSVWAAGKERVYGSDHDD P1ATNPD YGLVGS PSTGRTPTSVAAI NS K 
25 WVIQRU4TVKBLENRADLNHGKAIYSESVDFKDI 

RD PNKTYDEM I ALAKKHGALGVLI FNNKPGQSNRSMRLTANGMGI PSAFI SHE FGKAMSQLNGNGTGSLE 

FDSWSKAPSQKGNEMNHFSNWGLTSDGYLKPDITAPGGDIYSTYOTIWYGSQTCT 

KQYLEKTQPKLPKEKIADINrT(NLI24SNAQIHVNPETiaTrSPRQ 

ISIX^ITDTOTFDVTVHNLSNKDKTLRYDTELi 
30 VTMDVSQFTKELTKQMPNGYYLEGFWFRDSQDDQLNRVNI PFVGFKGQFENLAVAEBSI YRLKSQGKTG 

FYFDBSGPKDDIYVGKHFTGLVTU3SETNVSTKTISDNGLHTU3TFKNATC 

GDNNQDFAAFKGVFLRKYQGLKAS VYHASDKEH KN PLWVS PES FKGDKN FNSD I RFAKSTTLLGTAF SGK 
SLTGAELPDGHYHYWSYYPDWGAKRQ^FDMIIJJRQKPVLSQATFDPETNRFKPEPLKDRGIA 
DS VFYLERKDNKPYTVTINDS YKYVS VBDNKTFVBRQADGS FI LPLDKAKU5DFYYMVBDFAGNVAI AKL 
35 GDHLPQTLGKTPI KLKLTDGNYQTKETLKDNLBMTQSDTGLVTN^ I S 

PNEDGNKDFVAF KGLKNNVY1TOLTVNVYAKDDHQKQTP I WS SQAGAS VS A I ESTAWYG I TARGS KVMPGD 
YQYVVTYRDEHGKEHQKQYTISVNDKKPMITQGRFDT^^ 

RKFDVTBGKDGITVSDNKVY I PKNPDGS YTI SKRDGVTLSDYYYLVEDRAGNVS FATLRDLKAVGKDKAV 
VKFGLDLPVPEDKQIVNFTYLVRDAIX3KPIENLEYYNNSGNSLILPYGKYTVBLLTYD 
40 SmSADNNFMVTFKITMLATSQITAHFDHLLPBGSRVSLKTAQDQLI PLEQSLYVPKAYGKTVQEGTY 
BVVVSLPKGYRIB(^TKVNTLPNEVHEI.SLRLVKVGDASDSTC 
AKALPSTGEKMGLKLRIVGLVIjIiGLTCVFSRKKSTKD 

SEQ ID NO: 62 

45 GTGGAGAAAAAGCAACGTTTTTCCCTTAGAAAATACAAAT 
TTTTCTTGGTGATGACAACAACAGTAGCAGCAGAT^ 

TCACGCTCAACAACAAGCGCAACATCTCACCAATACAGAGTTGAGCT^ 
ACATCACAAATCACTCTCAAGACAAATCGTGAAAAAGAGCAATCACAAGATCTAGT 

CAACTGAGCTAGCTGACACAGATGCAGCATCAATGGCTAATACATC 
50 TTCITrACCGCCAGTCAATACAGATGTTC^CGATTGGGTAAAAACCAAAG 
AAAGGACAAGGCMGGTTGTCGCACTTATTG

GTGATGTATpUVCTGCTAAAGTAAAATCAAAAGAAGACATGCTAGCACGCCAAAAA 
TTATGGGAGTTOGATAAATGATAAAGTTGTTTTO 
GAAAATCAATTCGAGGATTTTGATGAGGACTGGGAAAACTTTGA 
S CCATCAAAAAACACAAGATCTATCGTrcCCAATCAACCCAGGCACCGAAAGAAACTGTT 
AGAAACAGATGGTTCACATGATATTGACTGGACACAAACAGACGATGACA 

ATGCATGTGACAGGTATTGTAGCCGGTAATAGCAAAGAAGCCGCTGCT TAGGAA 

TTGCACCAGAGGCCCAAGTCATGTTCATGCGTGTTTTTGCCAA 

CTTTATOUUVGCTATCGAAGATGCCGTGQCm 
1 0 AATGGGGCACAGCTTAGTGGCAGCAAGCCTCTAATGGAAGCAATTGAAAAAOT 

CAGTTGTTGTAGCAGCAGGAAATGAGCGCGTCTATGGATCT^ 

AGACTATGGTTTGGTCGGTTCTCCCTCAACAGGTC 

TGGGTGATTCAACGTCTMTOACGGTCAAAGAATTAGAAAACCGTG 

TCTATTCAGAGTCTGTCGACTTTAAAGACATAAAAGATAGCCTAGGTTATGATAAATOT 
1 5 TTATGTCAAAGAGTCAACTGATGCGGGTTATAACGCACAAGACGTTAAAGOTAA 

CGTGATCCCAATAAAACCTATGACGAAATGATTGCTTTGGCTAAGA 

TTTTTAATAA(^AGCCTGGTCAATCAAACCGCTCAATC 

TGCTTTCATATCGCACGAATTTGGTAAGGCCATGTCCCAATTAAATGGC 

TTTGACAGTGTGGTCTCAAAAGCACCXSAGTCAAAAAGGCAATGAAATGAATC^ 
20 TAACTTCTGATGGCTATTTAAAACCTGACATTACT 

TAACCACTATGGTAGCCAAACAGGAACAAGTATGGCCTCTCCTC^ 

AAACAATACCTAGAAAAGACTCAGCCAAACTTGCCAAAAGAAAAAATTC 

TGATGAGCAATGCTCAAATTCATGTTAATCCAGAGACAAAAACGACCACCTCAC CGCGTCAGCAAGGGGC 
AGGATTACTTAATATTGACGGAGCTGTCACTAGCGGCCm 
25 ATATCATTAGGOUVCATCACAGATACGATGACGTTTGATGTC 

AAACATTACGTTATGACACAGAATTGCTAACAGATCATGTAGACCCACAAAAGGGCTO 
TTCTCACTCCTTAAAAACGTACCAAGGAGGAGAAGTTACAGTCCCAGCXIAATC 

GTTACCATGGATGTCTCACAGTTCACAAAAGAGCTAACAAAACAGATGCCAA^ 
GTTTTGTCCGCTTTAGAGATAGTCAAGATGACCAACTAAATAGAGTA^ 
30 AGGGCAATTTGAAAACTTAGCAGTTGCAGAAGAGTCCATTTACAGATTAAAATC^ 
TTTTACITTGATGAATCAGGTCCAAAAGACGATATC 
TTGGTTCAGAGACCAATGTGTCAACCAAAACGATTTCTGACAATGGTCT 

AAATGCAGATGGCAAATTTATCTT AGAAAAAAATGC CCAAGGAAACCCKn'CTTAGCCATTTCTCCAAAT 

GGTGACAACAACCAAGATTTTGCAGCCTTCAAAGGTGriTKrr 
35 GTGTCTACCATGCTAGTGACAAGGAACACAAAAATCCACTC 

TAAAAACTTTAATAGTGACATTAGATTTGCAAAATCAACGACCCTGTTAGGCAC^ 

TCGTTAACAGGAGCTGAATTACCAGATGGGCATTATCATTATGTGGTGT 

GTGCCAAACGTCAAGAAATGACATTTGACATGATTTTA^ 

ATTTGATCCTGAAACAAACCGATTCAAACCAGAACCCCTAAAAGACCG 
40 GACAGTGTCTTTTATCTAGAAAGAAAAGAC^CAAGCCTTATACA^ 

ATGTCTCAGTAGAAGACAATAAAACATTTGTGGAG£GACAAGCTC^ 

TAAAGCAAAATTAGGGGATTTCTATTACATGGTCGAGGATTTTGCAGGGAACX5 

GGAGATCACTTACCACAAACATTAGGTAAAA«CCAATTAAACTTAAGCTTA 

CCAAAGAAACGCTTAAAGATAATCTTGAAATGACACAGTCTGACACAGGTCT^ 
45 GCTAGCAGTGGTGCACCGCAATCAGCCGCAAAGCCAGCTAACAAAGATGAATCAGGAl 

CCAAACGAAGATGGGAATAAAGACTTTGTGGCCTTTAAAGGCTTGAAAAA 

CGGTTAACGTATACGCTAAAGATGACCACCAAAAACAAACCCCTATCTGGTCT 
. TGTATCCGCTATTGAAAGTACAGCCTGGTAT^^ 

TATCAGTATGTTGTGACCTATCGTGACGAACATGGTAAAGAACATCAAAAG 
50 ATGACAAAAAACCAATGATCACTCAGGGACGTTTTGATACCATTAA 

CAAGACAAAAGCCCHTGACTCATCAGGCATTGTCaSCGAAGAAG 

CXSTAAATTTGATGTGACAGAAGGTAAAGATGGTATCACAGTTAGTGAO 

ATCCAGATGGTTCTTACACCATTTCAAAAAGAGATGGTGTCACACTGTCAGATTATTACT 

AGATAGAGCTGGTAATGTCTCTTITCCTACCTTGOSTGACCT 
. 55 GTCAATTTTGGATTAGACnTACCGGTCCCTGAAGACAAACAAATAGTGAA 

ATGCAGATGGTAAACCGATTGAAAACCTAGAGTATTATAATAACTCAGGTAACAGTCTT 

OGGCAAATACACGGTCGAATTGTTGACCTATGACACCAATGCAGCCAAACTAGACT 

TCCTTTACCTTGTCAGCTGATAACAACTTCCAACAAGTTA 

AAATAACTGCCCACTTTGATCATCTTTTGCCAGAAGGCAGTCGCGTTAGCCOT 
60 GCTAATCCCGCTTGAACAGTCOTGTATGTGCCT 

GAAGTTGTTGTCAGCCTGCCTAAAGGCTACCGTATCGAAGGCAACA 

AAGTGCACGAACTATCATTACGCCTTGTCAAAGTAGGAGATGCCTCAG^ 
TATGTCAAAAAATAATTCACACGCTTTGACACCCTCTCCCACACCA^

GCAAAAG CCCTACCATCAACGGQTGAAAAAATGGGTCTCAAGTTQCGCAT ACT AGGT^ 

QACTTACTT G CXjlCnTAGCOGAAAAAAATCAACCAAAGATTGA 

Preferred GAS 057 proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (eg. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 61; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 61, wherein n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200 or more). These GAS 057 proteins include variants (eg.
allelic variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 61. Preferred fragments
of (b) comprise an epitope from SEQ ID NO: 61. Other preferred fragments lack one or more amino
acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more
amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID
NO: 61. For example, in one embodiment, the underlined amino acid sequence at the N-terminus of
SEQ ID NO: 61 is removed. In another example, the underlined amino acid sequence at the C-
terminus of SEQ ID NO: 61 is removed. Other fragments omit one or more domains of the protein
(eg. omission of a signal peptide, of a cytoplasmic domain, of a transmembrane domain, or of an
extracellular domain).

The immunogenicity of other known GAS antigens may be improved by combination with two or
more GAS antigens of the first antigen group. Such other known GAS antigens include a second
antigen group consisting of (1) one or more variants of the M surface protein or fragments thereof,
(2) fibronectin-binding protein, (3) streptococcal heme-associated protein, or (4) SagA. These
antigens are referred to herein as the "second antigen group".

The invention thus includes an immunogenic composition comprising a combination of GAS
antigens, said combination consisting of two to thirty-one GAS antigens of the first antigen group and
one, two, three, or four GAS antigens of the second antigen group. Preferably, the combination
consists of three, four, five, six, seven, eight, nine, or ten GAS antigens from the first antigen group.
Still more preferably, the combination consists of three, four or five GAS antigens from the first
antigen group. Preferably, the combination of GAS antigens includes either or both of GAS 40 and
GAS 117. Preferably, the combination of GAS antigens includes one or more variants of the M
surface protein.
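
The counting rules of this embodiment can be checked mechanically. The following is a minimal sketch only: the antigen labels ('GAS 40', 'M', 'SfbI', etc.) are shorthand chosen here, membership of the first antigen group is not validated because its full list is not restated in this passage, and the function names are hypothetical.

```python
# Shorthand labels for the four members of the second antigen group (assumed names).
SECOND_GROUP = frozenset({"M", "SfbI", "Shp", "SagA"})


def is_valid_combination(first_group, second_group):
    """Two to thirty-one first-group antigens plus one to four second-group antigens."""
    first = set(first_group)
    second = set(second_group)
    if not second <= SECOND_GROUP:
        return False
    return 2 <= len(first) <= 31 and 1 <= len(second) <= 4


def matches_stated_preferences(first_group, second_group):
    """Still more preferred: three to five first-group antigens, with GAS 40 and/or GAS 117."""
    first = set(first_group)
    return (
        is_valid_combination(first, second_group)
        and 3 <= len(first) <= 5
        and bool({"GAS 40", "GAS 117"} & first)
    )
```

For example, a combination of GAS 40, GAS 57 and GAS 202 with the M protein passes both checks, whereas a single first-group antigen with the M protein passes neither.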

Each of the GAS antigens of the second antigen group is described in more detail below.

(1) M surface protein

Over 100 different type variants of the M protein have been identified. Epitopes having increased
bactericidal activity and having decreased likelihood of cross-reacting with human tissues have been
identified in the amino terminal region and combined into fusion proteins containing approximately
six, seven, or eight M protein fragments linked in tandem. See Refs. 4, 5, 6, WO 02/094851 and WO
94/06465. (Each of the M protein variants, fragments and fusion proteins described in these
references is specifically incorporated herein by reference.)
Accordingly, the compositions of the invention may further comprise a GAS M surface protein or a
fragment or derivative thereof. One or more GAS M surface protein fragments may be combined
together in a fusion protein. Alternatively, one or more GAS M surface protein fragments are
combined with a GAS antigen or fragment thereof of the first antigen group. One example of a GAS
M protein is set forth below.

SEQ ID NO: 63

MAKNNTNRHYS1AKLKTGTASVAVALTVLGAGPANQTBVKANG 
KDLKARLBNAMEVAGRDFKRAEBLEKAKQAL 
KEALBLAIIX}ASM>YHR&TAI£KELEB 
10 KLELDQLSS B KBQLTI B KAKLBBB KQI SDASRQS1JUU)LDASRBAKK0VBKDLANLTABLDKVKEDKQI S 

DASRQGLIUU>LDASREAKKQVBKDLANLTABLDKVl^ 
NSKIJ^ALEKLNKELEESKKLTEKEKABLQAKLEABAKAL 

KAVPGKC^APQAGTKPN^KAPMKETKRQLPSTGBTANPPFTAAALTVMATAGVAAVVKRKBKN 

Preferred GAS M proteins for use with the invention comprise an amino acid sequence: (a) having
50% or more identity (eg. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 63; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 63, wherein n is 7 or more (eg. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, 150 or more). These GAS M proteins include variants (eg. allelic
variants, homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 63. Preferred fragments of (b)
comprise an epitope from SEQ ID NO: 63. Preferably, the fragment is one of those described in the
references above. Preferably, the fragment is constructed in a fusion protein with one or more
additional M protein fragments. Other preferred fragments lack one or more amino acids (e.g. 1, 2, 3,
4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more amino acids (e.g. 1, 2,
3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID NO: 63. Other fragments
omit one or more domains of the protein (eg. omission of a signal peptide, of a cytoplasmic domain,
of a transmembrane domain, or of an extracellular domain).

(2) Fibronectin-binding protein

GAS fibronectin-binding protein ('SfbI') is a multifunctional bacterial protein thought to mediate
attachment of the bacteria to host cells, facilitate bacterial internalization into cells and to bind to the
Fc fragment of human IgG, thus interfering with Fc-receptor mediated phagocytosis and antibody-
dependent cell cytotoxicity. Immunization of mice with SfbI and an 'H12 fragment' (encoded by
positions 1240 - 1854 of the SfbI gene) is discussed in Refs. 7, 8 and 9. One example of an amino
acid sequence for GAS SfbI is shown below.

SEQ ID NO: 64

MS FDGFFLHHLTNELKENLLYGRI QKVNQPFERBLVIiTI RlHiRKNYKLLLS AHPVFGRVQI TQADFQNPQ 
VPNTFTMIMRKYLQC^VIEQLBQIDOTRIIE ILVDRAENKI I 

ESIKHVGFSQNSYRTILPGSTYIEPPKTAAVNPFTITO 
ABLLTTDKLKRFRBFFARPTQANLTTASFAPVL 
40 ' RVQTELDKNRNKLSKQEABUATEN^^ 

PNQNAQRYFKKYQKLKEAVKHLSGLI ADTKQS I TYFBSVDYNLSQAS I DDI BDIREELYQAGFLKSRQRD 
KRHKRKKPBQYIASDGTTILMVGRIWLQOT^ PGSHVI I KDNLDPSDBVKTDA 

AELAA YYS KARLSNLVQVDM I EAKKLHKPSGAKPGFVTYTGQKTLRVTPDQAKI LSMKLS 
Preferred SfbI proteins for use with the invention comprise an amino acid sequence: (a) having 50%
or more identity (eg. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 64; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 64, wherein n is 7 or more (eg. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100, or more). These SfbI proteins include variants (eg. allelic variants,
homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 64. Preferred fragments of (b) comprise
an epitope from SEQ ID NO: 64. Preferably, the fragment is one of those described in the references
above. Other preferred fragments lack one or more amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15,
20, 25 or more) from the C-terminus and/or one or more amino acids (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
15, 20, 25 or more) from the N-terminus of SEQ ID NO: 64. Other fragments omit one or more
domains of the protein (eg. omission of a signal peptide, of a cytoplasmic domain, of a
transmembrane domain, or of an extracellular domain).

(3) Streptococcal heme-associated protein

The GAS streptococcal heme-associated protein ('Shp') has been identified as a GAS cell surface
protein. It is thought to be cotranscribed with genes encoding homologues of an ABC transporter
involved in iron uptake in gram-negative bacteria. The Shp protein is further described in Ref. 10. One
example of a Shp protein is shown below:

SEQ ID NO: 65

20 MTKWI KQLLQVI WFMI SLSTMTNLVYADKGQI YGCI IQRNYRHPI SGQIEDSGGEHS FDIGQGMVEGT 
VYSDAMLEVSDAGKIVLTFRMSLADYSGNYQFVJIQPGGTC 
NSIIRGSMFVEPMGREWFYLSASELIQKYSGNM^ 
GAMITONKPKANSSNNKSLSDKKILPSKMGLTTSLE 
WKKRKKNDKTM 

Preferred Shp proteins for use with the invention comprise an amino acid sequence: (a) having 50% or
more identity (eg. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%,
98%, 99%, 99.5% or more) to SEQ ID NO: 65; and/or (b) which is a fragment of at least n
consecutive amino acids of SEQ ID NO: 65, wherein n is 7 or more (eg. 8, 10, 12, 14, 16, 18, 20, 25,
30, 35, 40, 50, 60, 70, 80, 90, 100 or more). These Shp proteins include variants (eg. allelic variants,
homologs, orthologs, paralogs, mutants, etc.) of SEQ ID NO: 65. Preferred fragments of (b) comprise
an epitope from SEQ ID NO: 65. Other preferred fragments lack one or more amino acids (e.g. 1, 2,
3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the C-terminus and/or one or more amino acids (e.g. 1,
2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25 or more) from the N-terminus of SEQ ID NO: 65. Other fragments
omit one or more domains of the protein (eg. omission of a signal peptide, of a cytoplasmic domain,
of a transmembrane domain, or of an extracellular domain).

(4) SagA

Streptolysin S (SLS), also known as 'SagA', is thought to be produced by almost all GAS colonies.
This cytolytic toxin is responsible for the beta-hemolysis surrounding colonies of GAS grown on
blood agar and is thought to be associated with virulence. While the full SagA peptide has not been
shown to be immunogenic, a fragment of amino acids 10 - 30 (SagA 10 - 30) has been used to
produce neutralizing antibodies. See Ref. 11. The amino acid sequence of SagA 10 - 30 is shown
below:

SEQ ID NO: 66 FSIATGSGNSQGGSGSYTPGKC

Preferred SagA 10-30 proteins for use with the invention comprise an amino acid sequence: (a)
having 50% or more identity (eg. 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%,
95%, 96%, 97%, 98%, 99%, 99.5% or more) to SEQ ID NO: 66; and/or (b) which is a fragment of at
least n consecutive amino acids of SEQ ID NO: 66, wherein n is 7 or more (eg. 8, 10, 12, 14, 16, 18,
or 20). These SagA 10-30 proteins include variants (eg. allelic variants, homologs, orthologs,
paralogs, mutants, etc.) of SEQ ID NO: 66.
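
Because SEQ ID NO: 66 is short, the set of fragments of n consecutive amino acids can be enumerated directly. The sketch below is illustrative only: it uses the peptide string exactly as printed in this text, a plain sliding window, and hypothetical function names.

```python
# SEQ ID NO: 66 as printed in this text.
SAGA_10_30 = "FSIATGSGNSQGGSGSYTPGKC"


def fragments(seq, n):
    """All runs of exactly n consecutive residues of seq (sliding window)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return [seq[i:i + n] for i in range(len(seq) - n + 1)]


def count_fragments_at_least(seq, n_min):
    """Number of start/length choices giving a fragment of n_min or more residues."""
    return sum(len(seq) - n + 1 for n in range(n_min, len(seq) + 1))


seven_mers = fragments(SAGA_10_30, 7)
```

Each element of `seven_mers` is a candidate fragment under condition (b) with n = 7; whether a given fragment comprises an epitope is an experimental question that the code does not address.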

There is an upper limit to the number of GAS antigens which will be in the compositions of the
invention. Preferably, the number of GAS antigens in a composition of the invention is less than 20,
less than 19, less than 18, less than 17, less than 16, less than 15, less than 14, less than 13, less than
12, less than 11, less than 10, less than 9, less than 8, less than 7, less than 6, less than 5, less than 4,
or less than 3. Still more preferably, the number of GAS antigens in a composition of the invention is
less than 6, less than 5, or less than 4. Still more preferably, the number of GAS antigens in a
composition of the invention is 3.

The GAS antigens used in the invention are preferably isolated, i.e., separate and discrete from the
whole organism with which the molecule is found in nature or, when the polynucleotide or
polypeptide is not found in nature, sufficiently free of other biological macromolecules that the
polynucleotide or polypeptide can be used for its intended purpose.

Fusion proteins

The GAS antigens used in the invention may be present in the composition as individual separate
polypeptides, but it is preferred that at least two (ie. 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17,
18, 19 or 20) of the antigens are expressed as a single polypeptide chain (a 'hybrid' polypeptide).
Hybrid polypeptides offer two principal advantages: first, a polypeptide that may be unstable or
poorly expressed on its own can be assisted by adding a suitable hybrid partner that overcomes the
problem; second, commercial manufacture is simplified as only one expression and purification need
be employed in order to produce two polypeptides which are both antigenically useful.

The hybrid polypeptide may comprise two or more polypeptide sequences from the first antigen
group. Accordingly, the invention includes a composition comprising a first amino acid sequence and
a second amino acid sequence, wherein said first and second amino acid sequences are selected from a
GAS antigen or a fragment thereof of the first antigen group. Preferably, the first and second amino
acid sequences in the hybrid polypeptide comprise different epitopes.

The hybrid polypeptide may comprise one or more polypeptide sequences from the first antigen group
and one or more polypeptide sequences from the second antigen group. Accordingly, the invention
includes a composition comprising a first amino acid sequence and a second amino acid sequence,
said first amino acid sequence selected from a GAS antigen or a fragment thereof from the first
antigen group and said second amino acid sequence selected from a GAS antigen or a fragment
thereof from the second antigen group. Preferably, the first and second amino acid sequences in the
hybrid polypeptide comprise different epitopes.

Hybrids consisting of amino acid sequences from two, three, four, five, six, seven, eight, nine, or ten
GAS antigens are preferred. In particular, hybrids consisting of amino acid sequences from two,
three, four, or five GAS antigens are preferred.

Different hybrid polypeptides may be mixed together in a single formulation. Within such
combinations, a GAS antigen may be present in more than one hybrid polypeptide and/or as a
non-hybrid polypeptide. It is preferred, however, that an antigen is present either as a hybrid or as a
non-hybrid, but not as both.

Hybrid polypeptides can be represented by the formula NH2-A-{-X-L-}n-B-COOH, wherein: X is an
amino acid sequence of a GAS antigen or a fragment thereof from the first antigen group or the
second antigen group; L is an optional linker amino acid sequence; A is an optional N-terminal amino
acid sequence; B is an optional C-terminal amino acid sequence; and n is 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,
12, 13, 14 or 15.
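
The hybrid formula can be read as a simple concatenation rule. The sketch below assembles a hybrid from its parts under that reading; the function name, the one-letter toy sequences and the choice to represent an absent -A-, -L- or -B- as an empty string are all assumptions made for illustration.

```python
def build_hybrid(xs, linkers=None, a="", b=""):
    """Assemble NH2-A-{-X-L-}n-B-COOH as a one-letter amino acid string.

    xs      : list of n amino acid sequences (the -X- moieties), 2 <= n <= 15
    linkers : list of n linker sequences (-L-); "" means the linker is absent
    a, b    : optional N-terminal (-A-) and C-terminal (-B-) sequences
    """
    n = len(xs)
    if not 2 <= n <= 15:
        raise ValueError("n must be between 2 and 15")
    if linkers is None:
        linkers = [""] * n
    if len(linkers) != n:
        raise ValueError("one (possibly empty) linker is required per -X- moiety")
    body = "".join(x + l for x, l in zip(xs, linkers))
    return a + body + b


# Toy example (hypothetical sequences): X1-L1-X2 followed by a His6 tag as -B-.
hybrid = build_hybrid(["MAAA", "CCC"], linkers=["GSGGGG", ""], b="HHHHHH")
```

Here `hybrid` is the N- to C-terminal string of X1, the linker L1, X2 and the -B- sequence, with L2 absent.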

If a -X- moiety has a leader peptide sequence in its wild-type form, this may be included or omitted in
the hybrid protein. In some embodiments, the leader peptides will be deleted except for that of the -X-
moiety located at the N-terminus of the hybrid protein i.e. the leader peptide of X1 will be retained,
but the leader peptides of X2 ... Xn will be omitted. This is equivalent to deleting all leader peptides
and using the leader peptide of X1 as moiety -A-.
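
The stated equivalence between retaining only the first leader peptide and using that leader as moiety -A- can be shown on toy data. The sketch assumes each -X- moiety is supplied as a (leader, mature) pair; the sequences and the function name are hypothetical.

```python
def strip_internal_leaders(moieties):
    """Keep the leader peptide of the first moiety only.

    moieties: list of (leader, mature) tuples in N- to C-terminal order.
    Returns the list of -X- sequences to be joined in the hybrid.
    """
    result = []
    for index, (leader, mature) in enumerate(moieties):
        result.append(leader + mature if index == 0 else mature)
    return result


# Toy moieties (hypothetical): leader peptides "MKK" and "MLL".
moieties = [("MKK", "AAA"), ("MLL", "CCC")]

# Option 1: retain the leader of X1 inside X1.
option_1 = "".join(strip_internal_leaders(moieties))

# Option 2: delete every leader and supply the leader of X1 as moiety -A-.
option_2 = moieties[0][0] + "".join(mature for _, mature in moieties)
```

Both options give the same polypeptide string, which is the equivalence the passage describes.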

For each n instances of {-X-L-}, linker amino acid sequence -L- may be present or absent. For
instance, when n=2 the hybrid may be NH2-X1-L1-X2-L2-COOH, NH2-X1-X2-COOH, NH2-X1-L1-X2-
COOH, NH2-X1-X2-L2-COOH, etc. Linker amino acid sequence(s) -L- will typically be short (eg. 20
or fewer amino acids i.e. 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1). Examples
comprise short peptide sequences which facilitate cloning, poly-glycine linkers (i.e. comprising Glyn
where n = 2, 3, 4, 5, 6, 7, 8, 9, 10 or more), and histidine tags (i.e. Hisn where n = 3, 4, 5, 6, 7, 8, 9, 10
or more). Other suitable linker amino acid sequences will be apparent to those skilled in the art. A
useful linker is GSGGGG, with the Gly-Ser dipeptide being formed from a BamHI restriction site,
thus aiding cloning and manipulation, and the (Gly)4 tetrapeptide being a typical poly-glycine linker.
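
The composition of the GSGGGG linker can be verified with a short sketch, assuming the restriction site named above is BamHI (recognition sequence GGATCC) read in frame as the two codons GGA and TCC. Only those two codons are included in the table; the function names are hypothetical.

```python
from itertools import product

# Only the two codons needed here (standard genetic code): GGA = Gly, TCC = Ser.
CODON_TABLE = {"GGA": "G", "TCC": "S"}
BAMHI_SITE = "GGATCC"


def translate(dna):
    """Translate an in-frame DNA string using the minimal codon table above."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def poly_gly(n):
    """Poly-glycine linker Gly_n."""
    return "G" * n


def his_tag(n):
    """Histidine tag His_n."""
    return "H" * n


def linker_patterns(n):
    """Every present/absent choice for the n linker positions L1 ... Ln."""
    return list(product((True, False), repeat=n))


useful_linker = translate(BAMHI_SITE) + poly_gly(4)
```

For n = 2, `linker_patterns(2)` yields the four arrangements listed in the passage: both linkers present, neither present, L1 only and L2 only.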

-A- is an optional N-terminal amino acid sequence. This will typically be short (eg. 40 or fewer
amino acids i.e. 39, 38, 37, 36, 35, 34, 33, 32, 31, 30, 29, 28, 27, 26, 25, 24, 23, 22, 21, 20, 19, 18, 17,
16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1). Examples include leader sequences to direct protein
trafficking, or short peptide sequences which facilitate cloning or purification (eg. histidine tags i.e.
Hisn where n = 3, 4, 5, 6, 7, 8, 9, 10 or more). Other suitable N-terminal amino acid sequences will be
apparent to those skilled in the art. If X1 lacks its own N-terminus methionine, -A- is preferably an
oligopeptide (eg. with 1, 2, 3, 4, 5, 6, 7 or 8 amino acids) which provides a N-terminus methionine.

-B- is an optional C-terminal amino acid sequence. This will typically be short (eg. 40 or fewer
amino acids i.e. 39, 38, 37, 36, 35, 34, 33, 32, 31, 30, 29, 28, 27, 26, 25, 24, 23, 22, 21, 20, 19, 18, 17,
16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1). Examples include sequences to direct protein
trafficking, short peptide sequences which facilitate cloning or purification (eg. comprising histidine
tags i.e. Hisn where n = 3, 4, 5, 6, 7, 8, 9, 10 or more), or sequences which enhance protein stability.
Other suitable C-terminal amino acid sequences will be apparent to those skilled in the art.

Most preferably, n is 2 or 3. 

The invention also provides nucleic acid encoding hybrid polypeptides of the invention. Furthermore,
the invention provides nucleic acid which can hybridise to this nucleic acid, preferably under "high
stringency" conditions (eg. 65°C in a 0.1xSSC, 0.5% SDS solution).

Polypeptides of the invention can be prepared by various means (eg. recombinant expression,
purification from cell culture, chemical synthesis, etc.) and in various forms (eg. native, fusions,
non-glycosylated, lipidated, etc.). They are preferably prepared in substantially pure form (i.e.
substantially free from other GAS or host cell proteins).

Nucleic acid according to the invention can be prepared in many ways (eg. by chemical synthesis,
from genomic or cDNA libraries, from the organism itself, etc.) and can take various forms (eg.
single stranded, double stranded, vectors, probes, etc.). They are preferably prepared in substantially
pure form (i.e. substantially free from other GAS or host cell nucleic acids).

The term "nucleic acid" includes DNA and RNA, and also their analogues, such as those containing
modified backbones (eg. phosphorothioates, etc.), and also peptide nucleic acids (PNA), etc. The
invention includes nucleic acid comprising sequences complementary to those described above (eg.
for antisense or probing purposes).

The invention also provides a process for producing a polypeptide of the invention, comprising the
step of culturing a host cell transformed with nucleic acid of the invention under conditions which
induce polypeptide expression.

The invention provides a process for producing a polypeptide of the invention, comprising the step of
synthesising at least part of the polypeptide by chemical means.

The invention provides a process for producing nucleic acid of the invention, comprising the step of
amplifying nucleic acid using a primer-based amplification method (eg. PCR).

The invention provides a process for producing nucleic acid of the invention, comprising the step of
synthesising at least part of the nucleic acid by chemical means.
Strains

Preferred polypeptides of the invention comprise an amino acid sequence found in an M1, M3 or M18
strain of GAS. The genomic sequence of an M1 GAS strain is reported at Ref. 12. The genomic
sequence of an M3 GAS strain is reported at Ref. 13. The genomic sequence of an M18 GAS strain is
reported at Ref. 14.

Where hybrid polypeptides are used, the individual antigens within the hybrid (i.e. individual -X-
moieties) may be from one or more strains. Where n=2, for instance, X2 may be from the same strain
as X1 or from a different strain. Where n=3, the strains might be (i) X1=X2=X3 (ii) X1=X2≠X3 (iii)
X1≠X2=X3 (iv) X1≠X2≠X3 or (v) X1=X3≠X2, etc.
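
The ways in which n moieties can share or differ in strain of origin can be enumerated as patterns of equality. The sketch below generates those patterns as restricted-growth tuples, where equal labels mean "same strain"; the function name and the labelling scheme are assumptions made for illustration and are not part of the described invention.

```python
def strain_patterns(n):
    """Yield every same/different-strain pattern for n moieties.

    Each pattern is a tuple of labels; pattern[i] == pattern[j] means that
    X(i+1) and X(j+1) come from the same strain. Labels are introduced in
    order of first appearance, so each distinct pattern is produced once.
    """
    def extend(prefix, max_label):
        if len(prefix) == n:
            yield tuple(prefix)
            return
        # Reuse any label seen so far, or introduce the next new label.
        for label in range(max_label + 2):
            yield from extend(prefix + [label], max(max_label, label))

    yield from extend([], -1)


patterns_n2 = list(strain_patterns(2))
patterns_n3 = list(strain_patterns(3))
```

For n = 2 this gives two patterns (same strain or different strains), and for n = 3 it gives five, for example (0, 0, 0) for three moieties from one strain and (0, 1, 0) for X1 and X3 sharing a strain that differs from that of X2.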

Purification and Recombinant Expression

The GAS antigens of the invention may be isolated from Streptococcus pyogenes, or they may be
recombinantly produced, for instance, in a heterologous host. Preferably, the GAS antigens are
prepared using a heterologous host. The heterologous host may be prokaryotic (e.g. a bacterium) or
eukaryotic. It is preferably E.coli, but other suitable hosts include Bacillus subtilis, Vibrio cholerae,
Salmonella typhi, Salmonella typhimurium, Neisseria lactamica, Neisseria cinerea, Mycobacteria
(e.g. M.tuberculosis), yeasts, etc.

Recombinant production of polypeptides is facilitated by adding a tag protein to the GAS antigen to
be expressed as a fusion protein comprising the tag protein and the GAS antigen. Such tag proteins
can facilitate purification, detection and stability of the expressed protein. Tag proteins suitable for
use in the invention include a polyarginine tag (Arg-tag), polyhistidine tag (His-tag), FLAG-tag,
Strep-tag, c-myc-tag, S-tag, calmodulin-binding peptide, cellulose-binding domain, SBP-tag, chitin-
binding domain, glutathione S-transferase-tag (GST), maltose-binding protein, transcription
termination anti-termination factor (NusA), E. coli thioredoxin (TrxA) and protein disulfide
isomerase I (DsbA). Preferred tag proteins include His-tag and GST. A full discussion on the use of
tag proteins can be found at Ref. 15.

After purification, the tag proteins may optionally be removed from the expressed fusion protein, i.e.,
by specifically tailored enzymatic treatments known in the art. Commonly used proteases include
enterokinase, tobacco etch virus (TEV), thrombin, and factor Xa.
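
Tag removal by a site-specific protease can be pictured as cutting the fusion protein at a recognition motif. The sketch below models only the commonly cited TEV protease site ENLYFQ followed by G or S, with cleavage after the Q; the construct layout (N-terminal His tag, then the site, then the antigen), the toy sequences and the function name are assumptions and do not describe any construct of this application.

```python
import re

# TEV recognition motif ENLYFQ|(G/S): the cut falls after Q, and the G or S
# stays on the released C-terminal product (lookahead keeps it unconsumed).
TEV_SITE = re.compile(r"ENLYFQ(?=[GS])")


def remove_n_terminal_tag(fusion):
    """Return the C-terminal product after cleavage at the first TEV site.

    If no site is found, the fusion protein is returned unchanged.
    """
    match = TEV_SITE.search(fusion)
    if match is None:
        return fusion
    return fusion[match.end():]


# Toy fusion (hypothetical): Met + His6 tag + TEV site + antigen "GAAA".
fusion = "M" + "H" * 6 + "ENLYFQ" + "GAAA"
antigen = remove_n_terminal_tag(fusion)
```

The other proteases named in the passage recognise different motifs and would need their own patterns.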

Immunogenic compositions and medicaments

Compositions of the invention are preferably immunogenic compositions, and are more preferably
vaccine compositions. The pH of the composition is preferably between 6 and 8, preferably about 7.
The pH may be maintained by the use of a buffer. The composition may be sterile and/or
pyrogen-free. The composition may be isotonic with respect to humans.

Vaccines according to the invention may either be prophylactic (i.e. to prevent infection) or
therapeutic (i.e. to treat infection), but will typically be prophylactic. Accordingly, the invention
includes a method for the therapeutic or prophylactic treatment of a Streptococcus pyogenes infection
in an animal susceptible to streptococcal infection comprising administering to said animal a
therapeutic or prophylactic amount of the immunogenic compositions of the invention. Preferably,
the immunogenic composition comprises a combination of GAS antigens, said combination consisting
of two to thirty-one GAS antigens of the first antigen group. Preferably, the combination of GAS
antigens consists of three, four, five, six, seven, eight, nine, or ten GAS antigens selected from the
first antigen group. Preferably, the combination of GAS antigens consists of three, four, or five GAS
antigens selected from the first antigen group. Preferably, the combination of GAS antigens includes
either or both of GAS 40 and GAS 117.

Alternatively, the invention includes an immunogenic composition comprising a combination of GAS
antigens, said combination consisting of two to thirty-one GAS antigens of the first antigen group and
one, two, three, or four GAS antigens of the second antigen group. Preferably, the combination
consists of three, four, five, six, seven, eight, nine, or ten GAS antigens from the first antigen group.
Still more preferably, the combination consists of three, four or five GAS antigens from the first
antigen group. Preferably, the combination of GAS antigens includes either or both of GAS 40 and
GAS 117. Preferably, the combination of GAS antigens includes one or more variants of the M
surface protein.

The invention also provides a composition of the invention for use as a medicament. The medicament
is preferably able to raise an immune response in a mammal (i.e. it is an immunogenic composition)
and is more preferably a vaccine.

The invention also provides the use of the compositions of the invention in the manufacture of a
medicament for raising an immune response in a mammal. The medicament is preferably a vaccine.

The invention also provides for a kit comprising a first component comprising a combination of GAS
antigens. In one embodiment, the combination of GAS antigens consists of a mixture of two to thirty-
one GAS antigens selected from the first antigen group. Preferably, the combination consists of three,
four, five, six, seven, eight, nine, or ten GAS antigens from the first antigen group. Preferably, the
combination consists of three, four, or five GAS antigens from the first antigen group. Preferably, the
combination includes either or both of GAS 117 and GAS 040.

In another embodiment, the kit comprises a first component comprising a combination of GAS
antigens consisting of a mixture of two to thirty-one GAS antigens of the first antigen group and one,
two, three, or four GAS antigens of the second antigen group. Preferably, the combination consists of
three, four, five, six, seven, eight, nine, or ten GAS antigens from the first antigen group. Still more
preferably, the combination consists of three, four or five GAS antigens from the first antigen group.
Preferably, the combination of GAS antigens includes either or both of GAS 40 and GAS 117.
Preferably, the combination of GAS antigens includes one or more variants of the M surface protein.

The invention also provides a delivery device pre-filled with the immunogenic compositions of the
invention.

The invention also provides a method for raising an immune response in a mammal comprising the
step of administering an effective amount of a composition of the invention. The immune response is
preferably protective and preferably involves antibodies and/or cell-mediated immunity. The method
may raise a booster response.

The mammal is preferably a human. Where the vaccine is for prophylactic use, the human is
preferably a child (eg. a toddler or infant) or a teenager; where the vaccine is for therapeutic use, the
human is preferably a teenager or an adult. A vaccine intended for children may also be administered
to adults eg. to assess safety, dosage, immunogenicity, etc.

These uses and methods are preferably for the prevention and/or treatment of a disease caused by
Streptococcus pyogenes (eg. pharyngitis (such as streptococcal sore throat), scarlet fever, impetigo,
erysipelas, cellulitis, septicemia, toxic shock syndrome, necrotizing fasciitis (flesh eating disease) and
sequelae (such as rheumatic fever and acute glomerulonephritis)). The compositions may also be
effective against other streptococcal bacteria.

One way of checking efficacy of therapeutic treatment involves monitoring GAS infection after
administration of the composition of the invention. One way of checking efficacy of prophylactic
treatment involves monitoring immune responses against the GAS antigens in the compositions of the
invention after administration of the composition.

Compositions of the invention will generally be administered directly to a patient. Direct delivery may
be accomplished by parenteral injection (eg. subcutaneously, intraperitoneally, intravenously,
intramuscularly, or to the interstitial space of a tissue), or by rectal, oral (eg. tablet, spray), vaginal,
topical, transdermal {eg. see ref. 16} or transcutaneous {eg. see refs. 17 & 18}, intranasal {eg. see
ref. 19}, ocular, aural, pulmonary or other mucosal administration.

The invention may be used to elicit systemic and/or mucosal immunity. 

Dosage treatment can be a single dose schedule or a multiple dose schedule. Multiple doses may be
used in a primary immunisation schedule and/or in a booster immunisation schedule. In a multiple
dose schedule the various doses may be given by the same or different routes eg. a parenteral prime
and mucosal boost, a mucosal prime and parenteral boost, etc.

The compositions of the invention may be prepared in various forms. For example, the compositions
may be prepared as injectables, either as liquid solutions or suspensions. Solid forms suitable for
solution in, or suspension in, liquid vehicles prior to injection can also be prepared (eg. a lyophilised
composition). The composition may be prepared for topical administration eg. as an ointment, cream
or powder. The composition may be prepared for oral administration eg. as a tablet or capsule, as a
spray, or as a syrup (optionally flavoured). The composition may be prepared for pulmonary
administration eg. as an inhaler, using a fine powder or a spray. The composition may be prepared as
a suppository or pessary. The composition may be prepared for nasal, aural or ocular administration
eg. as drops. The composition may be in kit form, designed such that a combined composition is
reconstituted just prior to administration to a patient. Such kits may comprise one or more antigens in
liquid form and one or more lyophilised antigens.

Immunogenic compositions used as vaccines comprise an immunologically effective amount of
antigen(s), as well as any other components, as needed. By 'immunologically effective amount', it is
meant that the administration of that amount to an individual, either in a single dose or as part of a
series, is effective for treatment or prevention. This amount varies depending upon the health and
physical condition of the individual to be treated, age, the taxonomic group of individual to be treated
(eg. non-human primate, primate, etc.), the capacity of the individual's immune system to synthesise
antibodies, the degree of protection desired, the formulation of the vaccine, the treating doctor's
assessment of the medical situation, and other relevant factors. It is expected that the amount will fall
in a relatively broad range that can be determined through routine trials.

Further components of the composition 

The composition of the invention will typically, in addition to the components mentioned above,
comprise one or more 'pharmaceutically acceptable carriers', which include any carrier that does not
itself induce the production of antibodies harmful to the individual receiving the composition.
Suitable carriers are typically large, slowly metabolised macromolecules such as proteins,
polysaccharides, polylactic acids, polyglycolic acids, polymeric amino acids, amino acid copolymers,
and lipid aggregates (such as oil droplets or liposomes). Such carriers are well known to those of
ordinary skill in the art. The vaccines may also contain diluents, such as water, saline, glycerol, etc.
Additionally, auxiliary substances, such as wetting or emulsifying agents, pH buffering substances,
and the like, may be present. A thorough discussion of pharmaceutically acceptable excipients is
available in reference 20.

Vaccines of the invention may be administered in conjunction with other immunoregulatory agents. In
particular, compositions will usually include an adjuvant.

Preferred further adjuvants include, but are not limited to, one or more of the following set forth
below:

A. Mineral Containing Compositions

Mineral containing compositions suitable for use as adjuvants in the invention include mineral salts,
such as aluminium salts and calcium salts. The invention includes mineral salts such as hydroxides
(eg. oxyhydroxides), phosphates (eg. hydroxyphosphates, orthophosphates), sulphates, etc. {eg. see
chapters 8 & 9 of ref. 21}, or mixtures of different mineral compounds, with the compounds taking
any suitable form (eg. gel, crystalline, amorphous, etc.), and with adsorption being preferred. The
mineral containing compositions may also be formulated as a particle of metal salt. See ref. 22.

B. Oil-Emulsions

Oil-emulsion compositions suitable for use as adjuvants in the invention include squalene-water
emulsions, such as MF59 (5% Squalene, 0.5% Tween 80, and 0.5% Span 85, formulated into
submicron particles using a microfluidizer). See ref. 23.

Complete Freund's adjuvant (CFA) and incomplete Freund's adjuvant (IFA) may also be used as
adjuvants in the invention.

C. Saponin Formulations

Saponin formulations may also be used as adjuvants in the invention. Saponins are a heterologous
group of sterol glycosides and triterpenoid glycosides that are found in the bark, leaves, stems, roots
and even flowers of a wide range of plant species. Saponins from the bark of the Quillaia saponaria
Molina tree have been widely studied as adjuvants. Saponin can also be commercially obtained from
Smilax ornata (sarsaparilla), Gypsophila paniculata (brides veil), and Saponaria officinalis (soap
root). Saponin adjuvant formulations include purified formulations, such as QS21, as well as lipid
formulations, such as ISCOMs.

Saponin compositions have been purified using High Performance Thin Layer Chromatography
(HP-TLC) and Reversed Phase High Performance Liquid Chromatography (RP-HPLC). Specific purified
fractions using these techniques have been identified, including QS7, QS17, QS18, QS21, QH-A,
QH-B and QH-C. Preferably, the saponin is QS21. A method of production of QS21 is disclosed in U.S.
Patent No. 5,057,540. Saponin formulations may also comprise a sterol, such as cholesterol (see WO
96/33739).

Combinations of saponins and cholesterols can be used to form unique particles called
Immunostimulating Complexes (ISCOMs). ISCOMs typically also include a phospholipid such as
phosphatidylethanolamine or phosphatidylcholine. Any known saponin can be used in ISCOMs.
Preferably, the ISCOM includes one or more of Quil A, QHA and QHC. ISCOMs are further
described in EP 0 109 942, WO 96/11711 and WO 96/33739. Optionally, the ISCOMs may be
devoid of additional detergent. See ref. 24.

A review of the development of saponin based adjuvants can be found at ref. 25.

C. Virosomes and Virus Like Particles (VLPs)

Virosomes and Virus Like Particles (VLPs) can also be used as adjuvants in the invention. These
structures generally contain one or more proteins from a virus optionally combined or formulated with
a phospholipid. They are generally non-pathogenic, non-replicating and generally do not contain any
of the native viral genome. The viral proteins may be recombinantly produced or isolated from whole
viruses. These viral proteins suitable for use in virosomes or VLPs include proteins derived from
influenza virus (such as HA or NA), Hepatitis B virus (such as core or capsid proteins), Hepatitis E
virus, measles virus, Sindbis virus, Rotavirus, Foot-and-Mouth Disease virus, Retrovirus, Norwalk
virus, human Papilloma virus, HIV, RNA-phages, Qβ-phage (such as coat proteins), GA-phage,
fr-phage, AP205 phage, and Ty (such as retrotransposon Ty protein p1). VLPs are discussed further in
WO 03/024480, WO 03/024481, and Refs. 26, 27, 28 and 29. Virosomes are discussed further in, for
example, Ref. 30.

D. Bacterial or Microbial Derivatives

Adjuvants suitable for use in the invention include bacterial or microbial derivatives such as:

(1) Non-toxic derivatives of enterobacterial lipopolysaccharide (LPS)

Such derivatives include Monophosphoryl lipid A (MPL) and 3-O-deacylated MPL (3dMPL).
3dMPL is a mixture of 3 De-O-acylated monophosphoryl lipid A with 4, 5 or 6 acylated chains. A
preferred "small particle" form of 3 De-O-acylated monophosphoryl lipid A is disclosed in EP 0 689
454. Such "small particles" of 3dMPL are small enough to be sterile filtered through a 0.22 micron
membrane (see EP 0 689 454). Other non-toxic LPS derivatives include monophosphoryl lipid A
mimics, such as aminoalkyl glucosaminide phosphate derivatives (eg. RC-529). See Ref. 31.

(2) Lipid A Derivatives

Lipid A derivatives include derivatives of lipid A from Escherichia coli such as OM-174. OM-174 is
described for example in Refs. 32 and 33.

(3) Immunostimulatory oligonucleotides

Immunostimulatory oligonucleotides suitable for use as adjuvants in the invention include nucleotide
sequences containing a CpG motif (a sequence containing an unmethylated cytosine followed by
guanosine and linked by a phosphate bond). Bacterial double stranded RNA or oligonucleotides
containing palindromic or poly(dG) sequences have also been shown to be immunostimulatory.

The CpGs can include nucleotide modifications/analogs such as phosphorothioate modifications and
can be double-stranded or single-stranded. Optionally, the guanosine may be replaced with an analog
such as 2'-deoxy-7-deazaguanosine. See ref. 34, WO 02/26757 and WO 99/62923 for examples of
possible analog substitutions. The adjuvant effect of CpG oligonucleotides is further discussed in
Refs. 35, 36, WO 98/40100, U.S. Patent No. 6,207,646, U.S. Patent No. 6,239,116, and U.S. Patent
No. 6,429,199.

The CpG sequence may be directed to TLR9, such as the motif GTCGTT or TTCGTT. See ref. 37.
The CpG sequence may be specific for inducing a Th1 immune response, such as a CpG-A ODN, or it
may be more specific for inducing a B cell response, such as a CpG-B ODN. CpG-A and CpG-B ODNs
are discussed in refs. 38, 39 and WO 01/95935. Preferably, the CpG is a CpG-A ODN.

Preferably, the CpG oligonucleotide is constructed so that the 5' end is accessible for receptor
recognition. Optionally, two CpG oligonucleotide sequences may be attached at their 3' ends to form
"immunomers". See, for example, refs. 40, 41, 42 and WO 03/035836.

(4) ADP-ribosylating toxins and detoxified derivatives thereof.

Bacterial ADP-ribosylating toxins and detoxified derivatives thereof may be used as adjuvants in the
invention. Preferably, the protein is derived from E. coli (i.e., E. coli heat labile enterotoxin "LT"),
cholera ("CT"), or pertussis ("PT"). The use of detoxified ADP-ribosylating toxins as mucosal
adjuvants is described in WO 95/17211 and as parenteral adjuvants in WO 98/42375. Preferably, the
adjuvant is a detoxified LT mutant such as LT-K63.

E. Human Immunomodulators

Human immunomodulators suitable for use as adjuvants in the invention include cytokines, such as
interleukins (eg. IL-1, IL-2, IL-4, IL-5, IL-6, IL-7, IL-12, etc.), interferons (eg. interferon-γ),
macrophage colony stimulating factor, and tumor necrosis factor.

F. Bioadhesives and Mucoadhesives

Bioadhesives and mucoadhesives may also be used as adjuvants in the invention. Suitable
bioadhesives include esterified hyaluronic acid microspheres (Ref. 43) or mucoadhesives such as
cross-linked derivatives of poly(acrylic acid), polyvinyl alcohol, polyvinyl pyrrolidone,
polysaccharides and carboxymethylcellulose. Chitosan and derivatives thereof may also be used as
adjuvants in the invention. E.g., ref. 44.

G. Microparticles

Microparticles may also be used as adjuvants in the invention. Microparticles (i.e. a particle of
~100nm to ~150μm in diameter, more preferably ~200nm to ~30μm in diameter, and most preferably
~500nm to ~10μm in diameter) formed from materials that are biodegradable and non-toxic (eg. a
poly(α-hydroxy acid), a polyhydroxybutyric acid, a polyorthoester, a polyanhydride, a
polycaprolactone, etc.), with poly(lactide-co-glycolide) are preferred, optionally treated to have a
negatively-charged surface (eg. with SDS) or a positively-charged surface (eg. with a cationic
detergent, such as CTAB).

H. Liposomes

Examples of liposome formulations suitable for use as adjuvants are described in U.S. Patent No.
6,090,406, U.S. Patent No. 5,916,588, and EP 0 626 169.

I. Polyoxyethylene Ether and Polyoxyethylene Ester Formulations

Adjuvants suitable for use in the invention include polyoxyethylene ethers and polyoxyethylene
esters. Ref. 45. Such formulations further include polyoxyethylene sorbitan ester surfactants in
combination with an octoxynol (Ref. 46) as well as polyoxyethylene alkyl ethers or ester surfactants
in combination with at least one additional non-ionic surfactant such as an octoxynol (Ref. 47).

Preferred polyoxyethylene ethers are selected from the following group: polyoxyethylene-9-lauryl
ether (laureth 9), polyoxyethylene-9-stearyl ether, polyoxyethylene-8-stearyl ether,
polyoxyethylene-4-lauryl ether, polyoxyethylene-35-lauryl ether, and polyoxyethylene-23-lauryl ether.

J. Polyphosphazene (PCPP)

PCPP formulations are described, for example, in Refs. 48 and 49.

K. Muramyl peptides

Examples of muramyl peptides suitable for use as adjuvants in the invention include
N-acetyl-muramyl-L-threonyl-D-isoglutamine (thr-MDP), N-acetyl-normuramyl-L-alanyl-D-isoglutamine
(nor-MDP), and N-acetylmuramyl-L-alanyl-D-isoglutaminyl-L-alanine-2-(1'-2'-dipalmitoyl-sn-glycero-3-
hydroxyphosphoryloxy)-ethylamine (MTP-PE).

L. Imidazoquinolone Compounds

Examples of imidazoquinolone compounds suitable for use as adjuvants in the invention include
Imiquimod and its homologues, described further in Refs. 50 and 51.

The invention may also comprise combinations of aspects of one or more of the adjuvants identified
above. For example, the following adjuvant compositions may be used in the invention:

(1) a saponin and an oil-in-water emulsion (ref. 52);

(2) a saponin (e.g., QS21) + a non-toxic LPS derivative (e.g., 3dMPL) (see WO 94/00153);

(3) a saponin (e.g., QS21) + a non-toxic LPS derivative (e.g., 3dMPL) + a cholesterol;

(4) a saponin (eg. QS21) + 3dMPL + IL-12 (optionally + a sterol) (Ref. 53);
combinations of 3dMPL with, for example, QS21 and/or oil-in-water emulsions (Ref. 54);

(5) SAF, containing 10% Squalane, 0.4% Tween 80, 5% pluronic-block polymer L121,
and thr-MDP, either microfluidized into a submicron emulsion or vortexed to generate a larger
particle size emulsion;

(6) Ribi™ adjuvant system (RAS) (Ribi Immunochem), containing 2% Squalene, 0.2%
Tween 80, and one or more bacterial cell wall components from the group consisting of
monophosphorylipid A (MPL), trehalose dimycolate (TDM), and cell wall skeleton (CWS),
preferably MPL + CWS (Detox™); and

(7) one or more mineral salts (such as an aluminum salt) + a non-toxic derivative of LPS
(such as 3dMPL).

Aluminium salts and MF59 are preferred adjuvants for parenteral immunisation. Mutant bacterial
toxins are preferred mucosal adjuvants.

The composition may include an antibiotic.

Further antigens

The compositions of the invention may further comprise one or more additional non-GAS antigens,
including additional bacterial, viral or parasitic antigens.

In one embodiment, the GAS antigen combinations of the invention are combined with one or more
additional, non-GAS antigens suitable for use in a paediatric vaccine. For example, the GAS antigen
combinations may be combined with one or more antigens derived from a bacterium or virus selected
from the group consisting of N. meningitidis (including serogroup A, B, C, W135 and/or Y),
Streptococcus pneumoniae, Bordetella pertussis, Moraxella catarrhalis, Tetanus, Diphtheria,
Respiratory Syncytial virus ('RSV'), polio, measles, mumps, rubella, and rotavirus.

In another embodiment, the GAS antigen combinations of the invention are combined with one or
more additional, non-GAS antigens suitable for use in a vaccine designed to protect elderly or
immunocompromised individuals. For example, the GAS antigen combinations may be combined
with an antigen derived from the group consisting of Enterococcus faecalis, Staphylococcus
aureus, Staphylococcus epidermidis, Pseudomonas aeruginosa, Legionella pneumophila, Listeria
monocytogenes, influenza, and Parainfluenza virus ('PIV').

Where a saccharide or carbohydrate antigen is used, it is preferably conjugated to a carrier protein in
order to enhance immunogenicity {e.g. refs. 55 to 64}. Preferred carrier proteins are bacterial toxins
or toxoids, such as diphtheria or tetanus toxoids. The CRM197 diphtheria toxoid is particularly
preferred {65}. Other carrier polypeptides include the N.meningitidis outer membrane protein {66},
synthetic peptides {67, 68}, heat shock proteins {69, 70}, pertussis proteins {71, 72}, protein D from
H.influenzae {73}, cytokines {74}, lymphokines, hormones, growth factors, toxin A or B from
C.difficile {75}, iron-uptake proteins {76}, etc. Where a mixture comprises capsular saccharides from
both serogroups A and C, it may be preferred that the ratio (w/w) of MenA saccharide:MenC
saccharide is greater than 1 (eg. 2:1, 3:1, 4:1, 5:1, 10:1 or higher). Different saccharides can be
conjugated to the same or different type of carrier protein. Any suitable conjugation reaction can be
used, with any suitable linker where necessary.

Toxic protein antigens may be detoxified where necessary eg. detoxification of pertussis toxin by
chemical and/or genetic means.

Where a diphtheria antigen is included in the composition it is preferred also to include tetanus
antigen and pertussis antigens. Similarly, where a tetanus antigen is included it is preferred also to
include diphtheria and pertussis antigens. Similarly, where a pertussis antigen is included it is
preferred also to include diphtheria and tetanus antigens.

Antigens in the composition will typically be present at a concentration of at least 1 μg/ml each. In
general, the concentration of any given antigen will be sufficient to elicit an immune response against
that antigen.

As an alternative to using protein antigens in the compositions of the invention, nucleic acid encoding
the antigen may be used {eg. refs. 77 to 85}. Protein components of the compositions of the
invention may thus be replaced by nucleic acid (preferably DNA eg. in the form of a plasmid) that
encodes the protein.






Definitions 

The term "comprising" means "including" as well as "consisting" eg. a composition "comprising" X
may consist exclusively of X or may include something additional eg. X + Y.

The term "about" in relation to a numerical value x means, for example, x±10%.

References to a percentage sequence identity between two amino acid sequences mean that, when
aligned, that percentage of amino acids are the same in comparing the two sequences. This alignment
and the percent homology or sequence identity can be determined using software programs known in
the art, for example those described in section 7.7.18 of reference 86. A preferred alignment is
determined by the Smith-Waterman homology search algorithm using an affine gap search with a gap
open penalty of 12 and a gap extension penalty of 2, BLOSUM matrix of 62. The Smith-Waterman
homology search algorithm is disclosed in reference 87.
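
The percentage identity defined above can be illustrated with a minimal sketch. It assumes that the
two sequences have already been aligned (for example by a Smith-Waterman implementation) and are
supplied as equal-length strings in which '-' marks a gap. The function name percent_identity and the
choice to divide by the number of alignment columns are assumptions made for illustration only; they
are not taken from the specification.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Return the percentage of alignment columns holding the same residue in both sequences.

    Assumption: the inputs are pre-aligned, equal-length strings, and the
    denominator is the full alignment length (other conventions exist).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    # A column counts as a match only if both residues are equal and not a gap.
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b)
        if a == b and a != gap
    )
    return 100.0 * matches / len(aligned_a)
```

For instance, the hypothetical aligned pair "MKT-LL" and "MKTALV" shares four identical columns out of
six, which this sketch reports as roughly 66.7% identity.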


The following example demonstrates one way of preparing recombinant GAS antigens of the 
invention and testing their efficacy in a murine model. 

EXAMPLE 1: Preparation of recombinant GAS antigens
of the invention and Demonstration of Efficacy in Murine Model.

Recombinant GAS proteins corresponding to two or more of the GAS antigens of the first antigen
group are expressed as follows.

1. Cloning of GAS antigens for expression in E. coli

The selected GAS antigens were cloned in such a way as to obtain two different kinds of
recombinant proteins: (1) proteins having a hexa-histidine tag at the carboxy-terminus (Gas-His)
and (2) proteins having the hexa-histidine tag at the carboxy-terminus and GST at the amino-
terminus (Gst-Gas-His). Type (1) proteins were obtained by cloning in a pET21b+ vector
(available from Novagen). The type (2) proteins were obtained by cloning in a pGEX-NNH
vector. This cloning strategy allowed for the GAS genomic DNA to be used to amplify the
selected genes by PCR, to perform a single restriction enzyme digestion of the PCR products and
to clone them simultaneously into both vectors.

(a) Construction of pGEX-NNH expression vectors

Two pairs of complementary oligodeoxyribonucleotides are synthesised using the DNA synthesiser
ABI394 (Perkin Elmer) and reagents from Cruachem (Glasgow, Scotland). Equimolar amounts of the
oligo pairs (50 ng each oligo) are annealed in T4 DNA ligase buffer (New England Biolabs) for 10
min in a final volume of 50 μl and then left to cool slowly at room temperature. With the described
procedure the following DNA linkers are obtained:

gexNN linker

NdeI NheI XmaI EcoRI NcoI SalI XhoI SacI
GATCXTCATATGGCTAGCCCGGGGAATTCGTCCATGGAGTGAGTCGACT^
GGTATACCGATCGGGCCCCTTAAGCAGGTACCTCACTCAGCTGACTGAGCTCACT

NotI
CTGAGCGGCCGCATGAA
GACTCGCCGGCGTACTTTCGA

gexNNH linker

HindIII NotI XhoI Hexa-Histidine
Ta^CAAGCTTGCGGCCGCACTO^^
GTTCGAACGCCGGCGTGAGCACGTAOAGGTAGTGGTAGTGACTATCGA

The plasmid pGEX-KG [K. L. Guan and J. E. Dixon, Anal. Biochem. 192, 262 (1991)] is digested
with BamHI and HindIII and 100 ng is ligated overnight at 16 °C to the linker gexNN with a molar
ratio of 3:1 linker/plasmid using 200 units of T4 DNA ligase (New England Biolabs). After
transformation of the ligation product in E. coli DH5, a clone containing the pGEX-NN plasmid,
having the correct linker, is selected by means of restriction enzyme analysis and DNA sequencing.
The new plasmid pGEX-NN is digested with SalI and HindIII and ligated to the linker gexNNH. After
transformation of the ligation product in E. coli DH5, a clone containing the pGEX-NNH plasmid,
having the correct linker, is selected by means of restriction enzyme analysis and DNA sequencing.

(b) Chromosomal DNA preparation 

GAS SF370 strain is grown in THY medium until OD600 is 0.6-0.8. Bacteria are then centrifuged,
suspended in TES buffer with lysozyme (10 mg/ml) and mutanolysin (10 U/μl) and incubated 1 hr at
37 °C. Following treatment of the bacterial suspension with RNAase, Proteinase K and 10%
Sarcosyl/EDTA, protein extraction with saturated phenol and phenol/chloroform is carried out. The
resulting supernatant is precipitated with Sodium Acetate/Ethanol and the extracted DNA is pelleted
by centrifugation, suspended in Tris buffer and kept at -20 °C.

(c) Oligonucleotide design 

Synthetic oligonucleotide primers are designed on the basis of the coding sequence of each GAS
antigen using the sequence of Streptococcus pyogenes SF370 M1 strain. Any predicted signal peptide
is omitted, by deducing the 5' end amplification primer sequence immediately downstream from the
predicted leader sequence. For most GAS antigens, the 5' tails of the primers (see Table 1, below)
include only one restriction enzyme recognition site (NdeI, or NheI, or SpeI depending on the gene's
own restriction pattern); the 3' primer tails (see Table 1) include a XhoI or a NotI or a HindIII
restriction site.



5' tails                          3' tails
NdeI  5' GTGCGTCATATG 3'          XhoI     5' GCGTCTCGAG 3'
NheI  5' GTGCGTGCTAGC 3'          NotI     5' ACTCGCTAGCGGCCGC 3'
SpeI  5' GTGCGTACTAGT 3'          HindIII  5' GCGTAAGCTT 3'

Table 1. Oligonucleotide tails of the primers used to amplify genes encoding selected GAS
antigens.

As well as containing the restriction enzyme recognition sequences, the primers include nucleotides
which hybridize to the sequence to be amplified. The number of hybridizing nucleotides depends on
the melting temperature of the primers which can be determined as described [Breslauer et al., Proc.
Nat. Acad. Sci. 83, 3746-50 (1986)]. The average melting temperature of the selected oligos is 50-55
°C for the hybridizing region alone and 65-75 °C for the whole oligos. Oligos can be purchased from
MWG-Biotech S.p.A. (Firenze, Italy).
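
The primer layout described above (a restriction-site tail followed by gene-specific hybridizing
nucleotides) can be sketched as follows. The tails are those listed in Table 1. The helper names and
the example hybridizing sequence are hypothetical, and the Wallace rule (2 °C per A/T, 4 °C per G/C)
is used only as a rough stand-in for the nearest-neighbour method of Breslauer et al. cited in the
text; it is an assumption for illustration and is intended for short oligos only.

```python
# Primer tails copied from Table 1 (restriction site -> tail sequence, 5' to 3').
FIVE_PRIME_TAILS = {
    "NdeI": "GTGCGTCATATG",
    "NheI": "GTGCGTGCTAGC",
    "SpeI": "GTGCGTACTAGT",
}
THREE_PRIME_TAILS = {
    "XhoI": "GCGTCTCGAG",
    "NotI": "ACTCGCTAGCGGCCGC",
    "HindIII": "GCGTAAGCTT",
}


def build_primer(tail: str, hybridizing_region: str) -> str:
    """Prepend a restriction-site tail to the gene-specific hybridizing region."""
    return tail + hybridizing_region.upper()


def wallace_tm(seq: str) -> int:
    """Rough melting temperature in degrees C by the Wallace rule (assumed stand-in)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc
```

A forward primer for a gene cloned through NdeI would then be assembled as
build_primer(FIVE_PRIME_TAILS["NdeI"], hybridizing_region), and wallace_tm can be applied to the
hybridizing region alone or to the whole oligo to compare against the two temperature ranges quoted
above.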

(d) PCR amplification 
The standard PCR protocol is as follows: 50 ng genomic DNA are used as template in the presence of
0.2 μM each primer, 200 μM each dNTP, 1.5 mM MgCl2, 1 x PCR buffer minus Mg (Gibco-BRL),
and 2 units of Taq DNA polymerase (Platinum Taq, Gibco-BRL) in a final volume of 100 μl. Each
sample undergoes a double-step amplification: the first 5 cycles are performed using as the
hybridization temperature that of the oligos excluding the restriction enzyme tail, followed by 25
cycles performed according to the hybridization temperature of the whole length primers. The
standard cycles are as follows:

one cycle:
denaturation: 94 °C, 2 min

5 cycles:
denaturation: 94 °C, 30 seconds
hybridization: 51 °C, 50 seconds
elongation: 72 °C, 1 min or 2 min and 40 sec

25 cycles:
denaturation: 94 °C, 30 seconds
hybridization: 70 °C, 50 seconds
elongation: 72 °C, 1 min or 2 min and 40 sec

72 °C, 7 min
4 °C

The elongation time is 1 min for GAS antigens encoded by ORFs shorter than 2000 bp, and 2 min and
40 seconds for ORFs longer than 2000 bp. The amplifications are performed using a Gene Amp PCR
system 9600 (Perkin Elmer).
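
The elongation rule stated above can be written as a small helper. This is only a sketch: the
function name is hypothetical, and because the text does not say which time applies to an ORF of
exactly 2000 bp, the longer time is assumed for that case.

```python
def elongation_seconds(orf_length_bp: int) -> int:
    """Return the 72 degrees C elongation time in seconds for a given ORF length.

    ORFs shorter than 2000 bp get 1 min; longer ORFs get 2 min 40 s.
    Assumption: an ORF of exactly 2000 bp is treated as 'longer'.
    """
    if orf_length_bp <= 0:
        raise ValueError("ORF length must be a positive number of base pairs")
    return 60 if orf_length_bp < 2000 else 160
```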

To check the amplification results, 4 μl of each PCR product is loaded onto a 1-1.5% agarose gel and the
size of amplified fragments compared with DNA molecular weight standards (DNA markers III or IX,
Roche). The PCR products are loaded on agarose gel and after electrophoresis the right size bands are
excised from the gel. The DNA is purified from the agarose using the Gel Extraction Kit (Qiagen)
following the instructions of the manufacturer. The final elution volume of the DNA is 50 μl TE (10
mM Tris-HCl, 1 mM EDTA, pH 8). One μl of each purified DNA is loaded onto agarose gel to
evaluate the yield.


(e) Digestion of PCR fragments 

One to two μg of purified PCR products are double digested overnight at 37 °C with the appropriate
restriction enzymes (60 units of each enzyme) using the appropriate restriction buffer in 100 μl final
volume. The restriction enzymes and the digestion buffers are from New England Biolabs. After
purification of the digested DNA (PCR purification Kit, Qiagen) and elution with 30 μl TE, 1 μl is
subjected to agarose gel electrophoresis to evaluate the yield in comparison to titrated molecular
weight standards (DNA markers III or IX, Roche).

(f) Digestion of the cloning vectors (pET21b+ and pGEX-NNH) 

10 μg of plasmid is double digested with 100 units of each restriction enzyme in 400 μl reaction
volume in the presence of appropriate buffer by overnight incubation at 37 °C. After electrophoresis
on a 1% agarose gel, the band corresponding to the digested vector is purified from the gel using the
Qiagen Qiaex II Gel Extraction Kit and the DNA is eluted with 50 μl TE. The DNA concentration
is evaluated by measuring OD260 of the sample.

(g) Cloning of the PCR products

Seventy five ng of the appropriately digested and purified vectors and the digested and purified
fragments corresponding to each selected GAS antigen are ligated in final volumes of 10-20 μl with a
molar ratio of 1:1 fragment/vector, using 400 units T4 DNA ligase (New England Biolabs) in the
presence of the buffer supplied by the manufacturer. The reactions are incubated overnight at 16 °C.
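
The mass of fragment needed for a given fragment/vector molar ratio follows from the relative
lengths of the two molecules. The sketch below shows the arithmetic; the function name is
hypothetical, and the vector and fragment lengths used in any call are illustrative placeholders
rather than values taken from this example.

```python
def insert_mass_ng(vector_ng: float, vector_bp: int, insert_bp: int,
                   insert_to_vector_ratio: float = 1.0) -> float:
    """Mass of insert (ng) giving the requested insert:vector molar ratio.

    For double-stranded DNA, moles are proportional to mass / length, so
    insert_ng = vector_ng * (insert_bp / vector_bp) * ratio.
    """
    if vector_bp <= 0 or insert_bp <= 0:
        raise ValueError("lengths must be positive")
    return vector_ng * (insert_bp / vector_bp) * insert_to_vector_ratio
```

With 75 ng of a hypothetical 5000 bp vector and a 1000 bp fragment, a 1:1 molar ratio corresponds to
about 15 ng of fragment.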

Transformation of E. coli BL21 (Novagen) and E. coli BL21-DE3 (Novagen) electrocompetent cells is
performed using pGEX-NNH ligations and pET21b+ ligations respectively. The transformation
procedure is as follows: 1-2 μl of the ligation reaction is mixed with 50 μl of ice cold competent cells;
then the cells are transferred to a Gene Pulser 0.1 cm electrode cuvette (Biorad). After pulsing the cells in
a MicroPulser electroporator (Biorad) following the manufacturer's instructions, the cells are suspended
in 0.95 ml of SOC medium and incubated for 45 min at 37 °C under shaking. 100 and 900 μl of cell
suspension are plated on separate plates of LB agar with 100 μg/ml ampicillin and the plates are
incubated overnight at 37 °C. The screening of the transformants is done by PCR: randomly chosen
transformants are picked and suspended in 30 μl of PCR reaction mix containing the PCR buffer, the
4 dNTPs, 1.5 mM MgCl2, Taq polymerase and appropriate forward and reverse oligonucleotide
primers that are able to hybridize upstream and downstream from the polylinker of pET21b+ or
pGEX-NNH vectors. After 30 cycles of PCR, 5 μl of the resulting products are run on agarose gel
electrophoresis in order to select for positive clones from which the expected PCR band is obtained.
PCR positive clones are chosen on the basis of the correct size of the PCR product, as evaluated by
comparison with appropriate molecular weight markers (DNA markers III or IX, Roche).

2. Protein expression 

PCR positive colonies are inoculated in 3 ml LB with 100 µg/ml ampicillin and grown at 37 °C overnight.
70 µl of the overnight culture is inoculated in 2 ml LB/Amp and grown at 37 °C until the OD600 of the
pET clones reaches 0.4-0.8 or until the OD600 of the pGEX clones reaches 0.8-1.
Protein expression is then induced by adding 1 mM IPTG (isopropyl β-D-thiogalactopyranoside) to
the mini-cultures. After 3 hours incubation at 37 °C the final OD600 is checked and the cultures are
cooled on ice. After centrifugation of 0.5 ml culture, the cell pellet is suspended in 50 µl of protein
Loading Sample Buffer (60 mM TRIS-HCl pH 6.8, 5% w/v SDS, 10% v/v glycerin, 0.1% w/v
Bromophenol Blue, 100 mM DTT) and incubated at 100 °C for 5 min. A volume of boiled sample
corresponding to 0.1 OD600 culture is analysed by SDS-PAGE and Coomassie Blue staining to verify
the presence of the induced protein band.
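
The loading volume "corresponding to 0.1 OD600 culture" depends on the final OD600 of each mini-culture. The following is a minimal sketch of that arithmetic, assuming that one OD600 unit means 1 ml of culture at OD600 = 1.0 (a common convention, not stated in the text). The function name and the example OD600 values are hypothetical.

```python
def load_volume_ul(final_od600, pelleted_ml=0.5, buffer_ul=50.0, od_units=0.1):
    """Volume of boiled sample (in µl) that carries `od_units` OD600 units.

    Assumption: 1 OD600 unit = 1 ml of culture at OD600 = 1.0.
    Defaults follow the protocol: 0.5 ml culture pelleted, 50 µl sample buffer.
    """
    if final_od600 <= 0:
        raise ValueError("final OD600 must be positive")
    # OD600 units contained in the pellet that was resuspended
    units_in_pellet = final_od600 * pelleted_ml
    # Fraction of the resuspended sample that holds the requested units
    return od_units / units_in_pellet * buffer_ul


if __name__ == "__main__":
    for od in (1.0, 1.5, 2.0):  # hypothetical final OD600 readings
        print(f"final OD600 {od:.1f}: load {load_volume_ul(od):.1f} µl")
```

Denser cultures therefore need a smaller loading volume, which keeps the amount of cell material per lane comparable across clones.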

3. Purification of the recombinant proteins 

Single colonies are inoculated in 25 ml LB with 100 µg/ml ampicillin and grown at 37 °C overnight. The
overnight culture is inoculated in 500 ml LB/Amp and grown under shaking at 25 °C until OD600 0.4-0.7.
Protein expression is then induced by adding 1 mM IPTG to the cultures. After 3.5 hours
incubation at 25 °C the final OD600 is checked and the cultures are cooled on ice. After centrifugation
at 6000 rpm (JA10 rotor, Beckman), the cell pellet is processed for purification or frozen at -20°C.

(a) Procedure for the purification of soluble His-tagged proteins from E. coli
(1) Transfer the pellets from -20°C to an ice bath and reconstitute with 10 ml 50 mM NaHPO4 buffer,
300 mM NaCl, pH 8.0, transfer to 40-50 ml centrifugation tubes and break the cells as per the following
outline.

(2) Break the pellets in the French Press performing three passages with in-line washing.

(3) Centrifuge at about 30-40000 x g for 15-20 min. If possible use rotor JA 25.50 (21000 rpm, 15
min.) or JA-20 (18000 rpm, 15 min.).

(4) Equilibrate the Poly-Prep columns with 1 ml Fast Flow Chelating Sepharose resin with 50 mM
phosphate buffer, 300 mM NaCl, pH 8.0.

(5) Store the centrifugation pellet at -20°C, and load the supernatant in the columns.

(6) Collect the flow through.

(7) Wash the columns with 10 ml (2 ml + 4 ml + 4 ml) 50 mM phosphate buffer, 300 mM NaCl, pH
8.0.

(8) Wash again with 10 ml 20 mM imidazole buffer, 50 mM phosphate, 300 mM NaCl, pH 8.0.

(9) Elute the proteins bound to the columns with 4.5 ml (1.5 ml + 1.5 ml + 1.5 ml) 250 mM imidazole
buffer, 50 mM phosphate, 300 mM NaCl, pH 8.0 and collect the 3 corresponding fractions of ~1.5 ml
each. Add to each tube 15 µl DTT 200 mM (final concentration 2 mM).

(10) Measure the protein concentration of the first two fractions with the Bradford method, collect a
10 µg aliquot of proteins from each sample and analyse by SDS-PAGE. (N.B.: should the sample be
too diluted, load 21 µl + 7 µl loading buffer).

(11) Store the collected fractions at +4°C while waiting for the results of the SDS-PAGE analysis.

(12) For immunisation prepare 4-5 aliquots of 100 µg each in 0.5 ml in 40% glycerol. The dilution
buffer is the above elution buffer, plus 2 mM DTT. Store the aliquots at -20°C until immunisation.
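
Two pieces of arithmetic in the steps above can be checked with a short sketch: the DTT concentration reached in step (9), and the volumes needed for one 100 µg aliquot in step (12). The function names are hypothetical, and the aliquot calculation assumes that undiluted (100%) glycerol is used, which the text does not specify.

```python
def final_conc_mM(stock_mM, added_ul, sample_ul):
    """Concentration after adding `added_ul` of stock to `sample_ul` of sample."""
    return stock_mM * added_ul / (sample_ul + added_ul)


def aliquot_plan(protein_mg_per_ml, target_ug=100.0, final_ml=0.5, glycerol_frac=0.40):
    """Return (protein_ml, glycerol_ml, buffer_ml) for one aliquot.

    Assumption: glycerol is added as a 100% stock, so 40% of 0.5 ml is 0.2 ml.
    """
    protein_ml = target_ug / 1000.0 / protein_mg_per_ml
    glycerol_ml = glycerol_frac * final_ml
    buffer_ml = final_ml - protein_ml - glycerol_ml
    if buffer_ml < 0:
        raise ValueError("fraction too dilute to fit 100 µg into the aliquot volume")
    return protein_ml, glycerol_ml, buffer_ml


if __name__ == "__main__":
    # Step (9): 15 µl of 200 mM DTT into a ~1.5 ml fraction
    print(f"DTT: {final_conc_mM(200.0, 15.0, 1500.0):.2f} mM")
    # Step (12): hypothetical Bradford result of 1.0 mg/ml
    print(aliquot_plan(1.0))
```

The first call gives a value just under 2 mM, in line with the stated final concentration. The second shows that a fraction below roughly 0.33 mg/ml cannot supply 100 µg within the 0.3 ml left after the glycerol.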

(b) Purification of His-tagged proteins from inclusion bodies
Purifications are carried out essentially according to the following protocol:

(1) Bacteria are collected from 500 ml cultures by centrifugation. If required, store bacterial pellets at
-20°C. For extraction, resuspend each bacterial pellet in 10 ml 50 mM TRIS-HCl buffer, pH 8.5, on
an ice bath.

(2) Disrupt the resuspended bacteria with a French Press, performing two passages.

(3) Centrifuge at 35000 x g for 15 min and collect the pellets. Use a Beckman rotor JA 25.50 (21000
rpm, 15 min.) or JA-20 (18000 rpm, 15 min.).

(4) Dissolve the centrifugation pellets with 50 mM TRIS-HCl, 1 mM TCEP (Tris(2-carboxyethyl)phosphine
hydrochloride, Pierce), 6M guanidinium chloride, pH 8.5. Stir for ~10 min. with a magnetic
bar.

(5) Centrifuge as described above, and collect the supernatant.

(6) ... Chelating Sepharose (Pharmacia) saturated with nickel according to the manufacturer's recommendations.
Wash the columns twice with 5 ml of H2O and equilibrate with 50 mM TRIS-HCl, 1 mM TCEP, 6M
guanidinium chloride, pH 8.5.

(7) Load the supernatants from step 5 onto the columns, and wash with 5 ml of 50 mM TRIS-HCl
buffer, 1 mM TCEP, 6M urea, pH 8.5.

(8) Wash the columns with 10 ml of 20 mM imidazole, 50 mM TRIS-HCl, 6M urea, 1 mM TCEP,
pH 8.5. Collect and set aside the first 5 ml for possible further controls.

(9) Elute the proteins bound to the columns with 4.5 ml of a buffer containing 250 mM imidazole, 50
mM TRIS-HCl, 6M urea, 1 mM TCEP, pH 8.5. Add the elution buffer in three 1.5 ml aliquots, and
collect the corresponding 3 fractions. Add to each fraction 15 µl DTT (final concentration 2 mM).

(10) Measure eluted protein concentration with the Bradford method, and analyse aliquots of ca 10
µg of protein by SDS-PAGE.

(11) Store proteins at -20°C in 40% (v/v) glycerol, 50 mM TRIS-HCl, 2M urea, 0.5 M arginine, 2
mM DTT, 0.3 mM TCEP, 83.3 mM imidazole, pH 8.5.
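
The storage-buffer values in step (11) are numerically consistent with the eluate of step (9) ending up at one third of its original concentration. This is an inference from the numbers, not something the text states, and the sketch below only checks that arithmetic. The dictionary keys and the function name are hypothetical.

```python
# Elution buffer components taken from step (9)
ELUTION_BUFFER = {"imidazole_mM": 250.0, "urea_M": 6.0, "TCEP_mM": 1.0}


def dilute(buffer, factor):
    """Concentrations after diluting every component `factor`-fold."""
    if factor <= 0:
        raise ValueError("dilution factor must be positive")
    return {name: conc / factor for name, conc in buffer.items()}


if __name__ == "__main__":
    # Assumed threefold dilution (inferred, not stated in the protocol)
    for name, conc in dilute(ELUTION_BUFFER, 3).items():
        print(f"{name}: {conc:.1f}")
```

Under that assumption the imidazole, urea and TCEP values match step (11) after rounding. TRIS-HCl, DTT, glycerol and arginine do not follow the same ratio, so they would have to come from the diluent.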

* 

(c) Procedure for the purification of GST-fusion proteins from E. coli
(1) Transfer the bacterial pellets from -20°C to an ice bath and suspend with 7.5 ml PBS, pH 7.4, to
which a mixture of protease inhibitors (COMPLETE™ - Boehringer Mannheim, 1 tablet every 25 ml
of buffer) has been added.

(2) Transfer to 40-50 ml centrifugation tubes and sonicate according to the following procedure:

a. Position the probe at about 0.5 cm from the bottom of the tube.
b. Block the tube with the clamp.
c. Dip the tube in an ice bath.
d. Set the sonicator as follows: Timer Hold, Duty Cycle 55, Out Control 6.
e. Perform 5 cycles of 10 impulses at a time lapse of 1 minute (i.e. each of the five cycles = 10
impulses + ~45 s hold).

(3) Centrifuge at about 30-40000 x g for 15-20 min. E.g.: use rotor Beckman JA 25.50 at 21000 rpm,
for 15 min.

(4) Store the centrifugation pellets at -20°C, and load the supernatants on the chromatography
columns, as follows:

(5) Equilibrate the Poly-Prep (Bio-Rad) columns with 0.5 ml (= 1 ml suspension) of Glutathione-Sepharose
4B resin, wash with 2 ml (1 + 1) H2O, and then with 10 ml (2 + 4 + 4) PBS, pH 7.4.

(6) Load the supernatants on the columns and discard the flow through.

(7) Wash the columns with 10 ml (2 + 4 + 4) PBS, pH 7.4.

(8) Elute the proteins bound to the columns with 4.5 ml of 50 mM TRIS buffer, 10 mM reduced
glutathione, pH 8.0, adding 1.5 ml + 1.5 ml + 1.5 ml and collecting the respective 3 fractions of ~1.5
ml each.

(9) Measure the protein concentration of the first two fractions with the Bradford method, analyse a
10 µg aliquot of proteins from each sample by SDS-PAGE. (N.B.: if the sample is too diluted load 21
µl (+ 7 µl loading buffer)).

(10) Store the collected fractions at +4°C while waiting for the results of the SDS-PAGE analysis.

(11) For each protein destined for immunisation prepare 4-5 aliquots of 100 µg each in 0.5 ml of
40% glycerol. The dilution buffer is 50 mM TRIS-HCl, 2 mM DTT, pH 8.0. Store the aliquots at
-20°C until immunisation.
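
The centrifugation steps in the purification procedures above quote both a relative centrifugal force (x g) and a rotor speed (rpm). The two are linked by the standard relation RCF = 1.118e-5 x r x rpm^2, with r in cm. The sketch below applies it. The radius used in the example is a hypothetical placeholder, since the text gives no rotor radius; the real value should be taken from the rotor manual.

```python
import math

RCF_CONSTANT = 1.118e-5  # standard constant for radius in cm and speed in rpm


def rcf_from_rpm(rpm, radius_cm):
    """Relative centrifugal force (x g) at a given speed and radius."""
    return RCF_CONSTANT * radius_cm * rpm ** 2


def rpm_for_rcf(rcf, radius_cm):
    """Rotor speed (rpm) needed to reach a given x g at a given radius."""
    return math.sqrt(rcf / (RCF_CONSTANT * radius_cm))


if __name__ == "__main__":
    radius_cm = 7.0  # hypothetical radius, NOT taken from the protocol
    print(f"{rcf_from_rpm(21000, radius_cm):.0f} x g at 21000 rpm")
    print(f"{rpm_for_rcf(35000, radius_cm):.0f} rpm for 35000 x g")
```

Because the force scales with the radius, the same rpm gives different x g values at the top and bottom of the tube, which is one reason a protocol may quote a range such as 30-40000 x g.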


4. Murine Model of Protection from GAS Infection
(a) Immunization protocol

Groups of 10 CD1 female mice aged between 6 and 7 weeks are immunized with two or more GAS
antigens of the invention (20 µg of each recombinant GAS antigen), suspended in 100 µl of suitable
solution. Each group receives 3 doses at days 0, 21 and 45. Immunization is performed through
intraperitoneal injection of the protein with an equal volume of Complete Freund's Adjuvant (CFA) for the
first dose and Incomplete Freund's Adjuvant (IFA) for the following two doses. In each immunization
scheme negative and positive control groups are used.

For the negative control group, mice are immunized with E. coli proteins eluted from the purification
columns following processing of total bacterial extract from an E. coli strain containing either the
pET21b or the pGEX-NNH vector (thus expressing GST only) without any cloned GAS ORF (groups
can be indicated as HisStop or GSTStop respectively).

For the positive control groups, mice are immunized with purified GAS M cloned from either GAS
SF370 or GAS DSM 2071 strains (groups indicated as 192SF and 192DSM respectively).

Pooled sera from each group are collected before the first immunization and two weeks after the last
one. Mice are infected with GAS about a week after.

Immunized mice are infected using a GAS strain different from that used for the cloning of the
selected proteins. For example, the GAS strain can be DSM 2071 M23 type, obtainable from the
German Collection of Microorganisms and Cell Cultures (DSMZ).

For infection experiments, DSM 2071 is grown at 37 °C in THY broth until OD600 0.4. Bacteria are
pelleted by centrifugation, washed once with PBS, suspended and diluted with PBS to obtain the
appropriate concentration of bacteria/ml and administered to mice by intraperitoneal injection.
Between 50 and 100 bacteria are given to each mouse, as determined by plating aliquots of the
bacterial suspension on 5 THY plates. Animals are observed daily and checked for survival.
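
The challenge dose is verified by plate counts, which amounts to averaging the colonies on the replicate plates and scaling to the injected volume. The following is a minimal sketch of that calculation. The colony counts, the plated volume and the injected volume are hypothetical, since the text gives none of them.

```python
from statistics import mean


def dose_cfu(colony_counts, plated_ml, injected_ml, dilution_factor=1.0):
    """Estimate bacteria per injected dose from replicate plate counts.

    `dilution_factor` is how many times the plated sample was diluted
    relative to the suspension that is actually injected.
    """
    if not colony_counts:
        raise ValueError("at least one plate count is required")
    cfu_per_ml = mean(colony_counts) / plated_ml * dilution_factor
    return cfu_per_ml * injected_ml


def within_target(dose, low=50.0, high=100.0):
    """True if the dose falls in the 50-100 bacteria window of the protocol."""
    return low <= dose <= high


if __name__ == "__main__":
    counts = [72, 65, 80, 70, 68]  # hypothetical counts on 5 THY plates
    dose = dose_cfu(counts, plated_ml=0.1, injected_ml=0.1)  # hypothetical volumes
    print(f"estimated dose: {dose:.0f} CFU, in range: {within_target(dose)}")
```

Using five plates rather than one reduces the sampling error, which matters when the target is only a few tens of bacteria per dose.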
5. Analysis of Immune Sera

(a) Preparation of GAS total protein extracts
Total protein extracts are prepared by incubating a bacterial culture grown to OD600 0.4-0.5 in Tris
50 mM pH 6.8/mutanolysin (20 units/ml) for 2 hr at 37 °C, followed by incubation for ten minutes on
ice in 0.24 N NaOH and 0.96% β-mercaptoethanol. The extracted proteins are precipitated by
addition of trichloroacetic acid, washed with ice-cold acetone and suspended in protein loading buffer.

(b) Western blot analysis 

Aliquots of total protein extract mixed with SDS loading buffer (1x: 60 mM TRIS-HCl pH 6.8, 5%
w/v SDS, 10% v/v glycerin, 0.1% Bromophenol Blue, 100 mM DTT) and boiled 5 minutes at 95 °C
are loaded on a 12.5% SDS-PAGE precast gel (Biorad). The gel is run using an SDS-PAGE running
buffer containing 250 mM TRIS, 2.5 mM glycine and 0.1% SDS. The gel is electroblotted onto
nitrocellulose membrane at 200 mA for 60 minutes. The membrane is blocked for 60 minutes with
PBS/0.05% Tween-20 (Sigma), 10% skimmed milk powder and incubated O/N at 4 °C with
PBS/0.05% Tween 20, 1% skimmed milk powder, with the appropriate dilution of the sera. After
washing twice with PBS/0.05% Tween, the membrane is incubated for 2 hours with
peroxidase-conjugated secondary anti-mouse antibody (Amersham) diluted 1:4000. The nitrocellulose is washed
three times for 10 minutes with PBS/0.05% Tween and once with PBS and thereafter developed by
Opti-4CN Substrate Kit (Biorad).

(c) Preparation of Paraformaldehyde treated GAS cultures

A bacterial culture grown to OD600 0.4-0.5 is washed once with PBS and concentrated four times in
PBS/0.05% Paraformaldehyde. Following 1 hr incubation at 37 °C with shaking, the treated culture
is kept overnight at 4 °C and complete inactivation of bacteria is then checked by plating aliquots on
THY blood agar plates.

(d) FACS analysis of Paraformaldehyde treated GAS cultures with mouse immune sera
About 10^5 Paraformaldehyde inactivated bacteria are washed with 200 µl of PBS in a 96-well U
bottom plate and centrifuged for 10 min. at 3000 g, at 4°C. The supernatant is discarded and the
bacteria are suspended in 20 µl of PBS-0.1% BSA. Eighty µl of either pre-immune or immune mouse
sera diluted in PBS-0.1% BSA are added to the bacterial suspension to a final dilution of either 1:100,
1:250 or 1:500, and incubated on ice for 30 min. Bacteria are washed once by adding 100 µl of
PBS-0.1% BSA, centrifuged for 10 min. at 3000 g, 4°C, suspended in 200 µl of PBS-0.1% BSA, centrifuged
again and suspended in 10 µl of Goat Anti-Mouse IgG, F(ab')2 fragment specific-R-Phycoerythrin-conjugated
(Jackson Immunoresearch Laboratories Inc., cat. N° 115-116-072) in PBS-0.1% BSA to a
final dilution of 1:100, and incubated on ice for 30 min. in the dark. Bacteria are washed once by
adding 180 µl of PBS-0.1% BSA and centrifuged for 10 min. at 3000 g, 4°C. The supernatant is
discarded and the bacteria are suspended in 200 µl of PBS. The bacterial suspension is passed through a
cytometric chamber of a FACS Calibur (Becton Dickinson, Mountain View, CA USA) and 10,000
events are acquired. Data are analysed using Cell Quest Software (Becton Dickinson, Mountain View,
CA USA) by drawing a morphological dot plot (using forward and side scatter parameters) on
bacterial signals. A histogram plot is then created on FL2 intensity of fluorescence log scale
recalling the morphological region of bacteria.
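
Because 80 µl of diluted serum is added to 20 µl of bacterial suspension, the serum has to be pre-diluted somewhat less than the stated final dilution. The following is a minimal sketch of that calculation; the function name is hypothetical, and the volumes default to those given in the FACS procedure above.

```python
def predilution(final_dilution, bacteria_ul=20.0, serum_ul=80.0):
    """Dilution factor the serum needs before it is added to the bacteria.

    Adding `serum_ul` of pre-diluted serum to `bacteria_ul` of suspension
    dilutes the serum a further (bacteria_ul + serum_ul) / serum_ul fold.
    """
    return final_dilution * serum_ul / (bacteria_ul + serum_ul)


if __name__ == "__main__":
    for final in (100, 250, 500):
        print(f"final 1:{final} -> pre-dilute serum 1:{predilution(final):.0f}")
```

With these volumes the mixing step contributes a further 1.25-fold dilution, so the three final dilutions correspond to pre-dilutions of 1:80, 1:200 and 1:400.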
It will be understood that the invention has been described by way of example only and
modifications may be made whilst remaining within the scope and spirit of the invention.

FIGURE 1: Annotation of GAS 40

FIGURE 2: Schematic of GAS40: putative surface exclusion protein prgA (873aa)